Abstract


Kernelization is a theoretical formalization of efficient preprocessing for NP-hard problems.
Empirically, preprocessing is highly successful in practice, for example in state-of-the-art
ILP-solvers like CPLEX. Motivated by this, previous work studied the existence of
kernelizations for ILP related problems, e.g., for testing feasibility of $Ax \le b$. In contrast
to the observed success of CPLEX, however, the results were largely negative. Intuitively, 
practical instances have far more useful structure than the worst-case instances used to prove 
these lower bounds. 

In the present paper, we study the effect that subsystems that have (a Gaifman graph 
of) bounded treewidth or that are totally unimodular have on the kernelizability of the 
ILP feasibility problem. We show that, on the positive side, if these subsystems have a 
small number of variables on which they interact with the remaining instance, then we can 
efficiently replace them by smaller subsystems of size polynomial in the domain without 
changing feasibility. Thus, if large parts of an instance consist of such subsystems, then 
this yields a substantial size reduction. Complementing this we prove that relaxations to 
the considered structures, e.g., larger boundaries of the subsystems, allow worst-case lower 
bounds against kernelization. Thus, these relaxed structures give rise to instance families 
that cannot be efficiently reduced, by any approach. 

1 Introduction 

The notion of kernelization from parameterized complexity is a theoretical formalization of 
preprocessing (i.e., data reduction) for NP-hard combinatorial problems. Within this framework 
it is possible to prove worst-case upper and lower bounds for preprocessing; see, e.g., recent 
surveys on kernelization [laiiH]. Arguably one of the most successful examples of preprocessing 
in practice are the simplification routines within modern integer linear program (ILP) solvers 
like CPLEX (see also [11 112( fT^). Since ILPs have high expressive power, already the problem 
of testing feasibility of an ILP is NP-hard; there are immediate reductions from a variety of
well-known NP-hard problems. Thus, the problem also inherits many lower bounds, in particular,
lower bounds against kernelization. 

INTEGER LINEAR PROGRAM FEASIBILITY - ILPF 
Input: A matrix $A \in \mathbb{Z}^{m \times n}$ and a vector $b \in \mathbb{Z}^m$.

Question: Is there an integer vector $x \in \mathbb{Z}^n$ with $Ax \le b$?

Despite this negative outlook, a formal theory of preprocessing, such as kernelization aims
to be, needs to provide a more detailed view on one of the most successful practical examples of 
preprocessing, even if worst-case bounds will rarely match empirical results. With this premise 
we take a structural approach to studying kernelization for ILPF. We pursue two main structural
aspects of ILPs. The first one is the treewidth of the so-called Gaifman graph underlying the
constraint matrix $A$. As a second aspect we consider ILPs whose constraint matrix has large
parts that are totally unimodular. Both bounded treewidth and total unimodularity of the whole
system $Ax \le b$ imply that feasibility (and optimization) are tractable.^1 We study the effect of
having subsystems that have bounded treewidth or that are totally unimodular. We determine
when such subsystems allow for a substantial reduction in instance size. Our approach differs
from previous work [15, 16] in that we study structural parameters related to treewidth and
total unimodularity rather than considering parameters such as the dimensions $n$ and $m$ of the
constraint matrix $A$ or the sparsity thereof.


Treewidth and ILPs. The Gaifman graph $G(A)$ of a matrix $A \in \mathbb{Z}^{m \times n}$ is a graph with one
vertex per column of $A$, i.e., one vertex per variable, such that variables that occur in a common
constraint form a clique in $G(A)$ (see Section 3.1). This perspective allows us to consider the
structure of an ILP by graph-theoretical means. In the context of graph problems, a frequently
employed preprocessing strategy is to replace a simple (i.e., constant-treewidth) part of the
graph that attaches to the remainder through a constant-size boundary, by a smaller gadget
that enforces the same restrictions on potential solutions. There are several meta-kernelization
theorems (cf. [13]) stating that large classes of graph problems can be effectively preprocessed by
repeatedly replacing such protrusions by smaller structures. It is therefore natural to consider
whether large protrusions in the Gaifman graph $G(A)$, corresponding to subsystems of the ILP,
can safely be replaced by smaller subsystems.

We give an explicit dynamic programming algorithm to determine which assignments to the
boundary variables (see Section 3.3) of the protrusions can be extended to feasible assignments to
the remaining variables in the protrusion. Then we show that, given a list of feasible assignments
to the boundary of the protrusion, the corresponding subsystem of the ILP can be replaced by
new constraints. If there are $r$ variables in the boundary and their domain is bounded by $d$, we
find a replacement system with $O(r \cdot d^r)$ variables and constraints that can be described in $O(d^{2r})$
bits. By an information-theoretic argument we prove that equivalent replacement systems
require $\Omega(d^r)$ bits to encode. Moreover, we prove that large-domain structures are indeed
an obstruction for effective kernelization by proving that a family of instances with a single
variable of large domain (all others have domain $\{0,1\}$), and with given Gaifman decompositions into
protrusions and a small shared part of encoding size $N$, admits no kernelization or compression
to size polynomial in $N$.
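
To give some intuition for the $\Omega(d^r)$ bound, the following counting sketch may help. It is only an illustration under the assumption that every set of boundary assignments can occur as the feasible set of some boundaried ILP (which is plausible, since infeasible assignments can be blocked one by one); it is not necessarily the argument used in the formal proof.

```latex
\begin{align*}
\#\{\text{assignments to $r$ boundary variables of domain size $d$}\} &= d^r, \\
\#\{\text{possible sets of feasible boundary assignments}\} &= 2^{d^r}, \\
\text{bits needed to distinguish all of them} &\ge \log_2 2^{d^r} = d^r .
\end{align*}
```

For instance, with $d = 2$ and $r = 3$ there are $8$ boundary assignments and $2^8 = 256$ candidate feasible sets, so at least $8$ bits are needed in the worst case.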

On the positive side, we apply the replacement algorithm to protrusion decompositions of
the Gaifman graph to shrink ILPF instances. When an ILPF instance can be decomposed into
a small number of protrusions with small boundary domains, replacing each protrusion by a
small equivalent gadget yields an equivalent instance whose overall size is bounded. The recent
work of Kim et al. [13] on meta-kernelization has identified a structural graph parameter such
that graphs from an appropriately chosen family with parameter value $k$ can be decomposed
into $O(k)$ protrusions. If the Gaifman graph of an ILPF instance satisfies these requirements, the
ILPF problem has kernels of size polynomial in $k$. Concretely, one can show that bounded-domain
ILPF has polynomial kernels when the Gaifman graph excludes a fixed graph $H$ as a topological
minor and the parameter $k$ is the size of a modulator of the graph to constant treewidth. We
do not pursue this application further in the paper, as it follows from our reduction algorithms
in a straightforward manner.

^1 Small caveat: For bounded treewidth this also requires bounded domain.

Total unimodularity. Recall that a matrix is totally unimodular (TU) if every square submatrix
has determinant $1$, $-1$, or $0$. If $A$ is TU then feasibility of $Ax \le b$, for any integral vector
$b$, can be tested in polynomial time. (Similarly, one can efficiently optimize any function $c^T x$
subject to $Ax \le b$.) We say that a matrix $A$ is totally unimodular plus $p$ columns if it can be
obtained from a TU matrix by changing entries in at most $p$ columns. Clearly, changing a single
entry may break total unimodularity, but changing only a few entries should still give a system
of constraints $Ax \le b$ that is much simpler than the worst case. Indeed, if, e.g., all variables
are binary (domain $\{0,1\}$) then one may check feasibility by simply trying all $2^p$ assignments
to variables with modified column in $A$. The system on the remaining variables will be TU and
can be tested efficiently.
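
To make the definition concrete, here is a minimal Python sketch that checks total unimodularity directly from the definition by enumerating all square submatrices. It takes exponential time and is meant only for tiny examples; the function names `det` and `is_totally_unimodular` are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import combinations


def det(mat):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in mat]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def is_totally_unimodular(A):
    """Brute force: every square submatrix must have determinant in {-1, 0, 1}."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[r][c] for c in cols] for r in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Changing a single entry can break total unimodularity:
tu = [[1, 1], [0, 1]]
modified = [[1, 1], [-1, 1]]  # determinant 2
print(is_totally_unimodular(tu), is_totally_unimodular(modified))
```

The second matrix differs from the first in one entry only, which illustrates the remark that a single change may already destroy the property.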

From the perspective of kernelization it is interesting whether a small value of $p$ allows a
reduction in size for $Ax \le b$ or, in other words, whether one can efficiently find an equivalent
system of size polynomial in $p$. We prove that this depends on the structure of the system
on variables with unmodified columns. If this remaining system decomposes into separate
subsystems, each of which depends only on a bounded number of variables in non-TU columns,
then by a similar reduction rule as for the treewidth case we get a reduced instance of size
polynomial in $p$ and the domain size $d$. Complementing this we prove that in general, i.e.,
without this bounded dependence, there is no kernelization to size polynomial in $p + d$; this
holds even if $p$ counts the number of entry changes to obtain $A$ from a TU matrix, rather than
the (usually smaller) number of modified columns.

Related work. Several lower bounds for kernelization for ILPF and other ILP-related problems
follow already from lower bounds for other (less general) problems. For example, unless
NP $\subseteq$ coNP/poly and the polynomial hierarchy collapses,^2 there is no efficient algorithm that
reduces every instance $(A, b)$ of ILPF to an equivalent instance of size polynomial in $n$ (here $n$
refers to the number of columns in $A$); this follows from lower bounds for HITTING SET [9] or for
SATISFIABILITY [8] and, thus, holds already for binary variables (0/1-ILPF). The direct study
of kernelization properties of ILPs was initiated in [15, 16] and focused on the influence of row-
and column-sparsity of $A$ on having kernelization results in terms of the dimensions $n$ and $m$
of $A$. At a high level, the outcome is that unbounded domain variables rule out essentially all
nontrivial attempts at polynomial kernelizations. In particular, ILPF admits no kernelization
to size polynomial in $n + m$ when variable domains are unbounded, unless NP $\subseteq$ coNP/poly;
this remains true under strict bounds on sparsity [16]. For bounded domain variables the situation
is a bit more positive: there are generalizations of positive results for $d$-HITTING SET and
$d$-SATISFIABILITY (when sets/clauses have size at most $d$). One can reduce to size polynomial
in $n$ in general [16], and to size polynomial in $k$ when seeking a feasible $x \ge 0$ with $|x|_1 \le k$ for
a sparse covering ILP [15].

Organization. Section 2 contains preliminaries about parameterized complexity, graphs, and
treewidth. In Section 3 we analyze the effect of treewidth on preprocessing ILPs, while we
consider the effect of large totally unimodular submatrices in Section 4. In Section 5 we discuss
some differences between totally unimodular and bounded-treewidth subsystems. We conclude
in Section 6.

^2 NP $\not\subseteq$ coNP/poly is a standard assumption in computational complexity. It is stronger than P $\ne$ NP and
NP $\ne$ coNP, and it is known that NP $\subseteq$ coNP/poly implies a collapse of the polynomial hierarchy.

2 Preliminaries


Parameterized complexity and kernelization. A parameterized problem is a set $Q \subseteq \Sigma^* \times \mathbb{N}$
where $\Sigma$ is any finite alphabet and $\mathbb{N}$ denotes the non-negative integers. In an instance $(x, k) \in
\Sigma^* \times \mathbb{N}$ the second component is called the parameter. A parameterized problem $Q$ is
fixed-parameter tractable (FPT) if there is an algorithm that, given any instance $(x, k) \in \Sigma^* \times \mathbb{N}$,
takes time $f(k) |x|^{O(1)}$ and correctly determines whether $(x, k) \in Q$; here $f$ is any computable
function. A kernelization for $Q$ is an algorithm $K$ that, given $(x, k) \in \Sigma^* \times \mathbb{N}$, takes time
polynomial in $|x| + k$ and returns an instance $(x', k') \in \Sigma^* \times \mathbb{N}$ such that $(x, k) \in Q$ if and
only if $(x', k') \in Q$ (i.e., the two instances are equivalent) and $|x'| + k' \le h(k)$; here $h$ is any
computable function, and we also call it the size of the kernel. If $h(k)$ is polynomially bounded
in $k$, then $K$ is a polynomial kernelization. We also define (polynomial) compression; the only
difference with kernelization is that the output is any instance $x' \in \Sigma'^*$ with respect to any
fixed language $\mathcal{L}$, i.e., we demand that $(x, k) \in Q$ if and only if $x' \in \mathcal{L}$ and that $|x'| \le h(k)$.
A polynomial-parameter transformation from a parameterized problem $P$ to a parameterized
problem $Q$ is a polynomial-time mapping that transforms each instance $(x, k)$ of $P$ into an
equivalent instance $(x', k')$ of $Q$, with the guarantee that $(x, k) \in P$ if and only if $(x', k') \in Q$
and $k' \le p(k)$ for some polynomial $p$.

Lower bounds for kernelization. For one of our lower bound proofs we use the notion of
a cross-composition from [7], which builds on the framework for lower bounds for kernelization
by Bodlaender et al. [5] and Fortnow and Santhanam.

Definition 1. An equivalence relation $\mathcal{R}$ on $\Sigma^*$ is called a polynomial equivalence relation if
the following two conditions hold:

1. There is an algorithm that given two strings $x, y \in \Sigma^*$ decides whether $x$ and $y$ belong to
the same equivalence class in $(|x| + |y|)^{O(1)}$ time.

2. For any finite set $S \subseteq \Sigma^*$ the equivalence relation $\mathcal{R}$ partitions the elements of $S$ into at
most $(\max_{x \in S} |x|)^{O(1)}$ classes.

Definition 2. Let $L \subseteq \Sigma^*$ be a set and let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem. We say
that $L$ cross-composes into $Q$ if there is a polynomial equivalence relation $\mathcal{R}$ and an algorithm
that, given $t$ strings $x_1, x_2, \ldots, x_t$ belonging to the same equivalence class of $\mathcal{R}$, computes an
instance $(x^*, k^*) \in \Sigma^* \times \mathbb{N}$ in time polynomial in $\sum_{i=1}^{t} |x_i|$ such that:

1. $(x^*, k^*) \in Q \Leftrightarrow x_i \in L$ for some $i \in [t]$,

2. $k^*$ is bounded by a polynomial in $\max_{i \in [t]} |x_i| + \log t$.

Theorem 1 ([7]). If the set $L \subseteq \Sigma^*$ is NP-hard under Karp reductions and $L$ cross-composes
into the parameterized problem $Q$, then there is no polynomial kernel or compression for $Q$
unless NP $\subseteq$ coNP/poly.

Graphs. All graphs in this work are simple, undirected, and finite. For a finite set $X$ and
positive integer $n$, we denote by $\binom{X}{n}$ the family of size-$n$ subsets of $X$. The set $\{1, \ldots, n\}$ is
abbreviated as $[n]$. An undirected graph $G$ consists of a vertex set $V(G)$ and edge set $E(G) \subseteq
\binom{V(G)}{2}$. For a set $X \subseteq V(G)$ we use $G[X]$ to denote the subgraph of $G$ induced by $X$. We
use $G - X$ as a shorthand for $G[V(G) \setminus X]$. For $v \in V(G)$ we use $N_G(v)$ to denote the open
neighborhood of $v$. For $X \subseteq V(G)$ we define $N_G(X) := \left(\bigcup_{v \in X} N_G(v)\right) \setminus X$. The boundary of $X$
in $G$, denoted $\partial_G(X)$, is the set of vertices in $X$ that have a neighbor in $V(G) \setminus X$.

Treewidth and protrusion decompositions. A tree decomposition of a graph $G$ is a
pair $(T, \mathcal{X})$, where $T$ is a tree and $\mathcal{X} = \{X_i \mid i \in V(T)\}$ is a family of subsets of $V(G)$ called
bags, such that (i) $\bigcup_{i \in V(T)} X_i = V(G)$, (ii) for each edge $\{u, v\} \in E(G)$ there is a node $i \in V(T)$
with $\{u, v\} \subseteq X_i$, and (iii) for each $v \in V(G)$ the nodes $\{i \mid v \in X_i\}$ induce a connected subtree
of $T$. The width of the tree decomposition is $\max_{i \in V(T)} |X_i| - 1$. The treewidth of a graph $G$,
denoted $\mathrm{tw}(G)$, is the minimum width over all tree decompositions of $G$. An optimal tree
decomposition of an $n$-vertex graph $G$ can be computed in time $O(2^{O(\mathrm{tw}(G)^3)} n)$ using Bodlaender's
algorithm [3]. A 5-approximation to treewidth can be computed in time $O(2^{O(\mathrm{tw}(G))} n)$ using
the recent algorithm of Bodlaender et al. [6]. A vertex set $X$ such that $\mathrm{tw}(G - X) \le t$ is called
a treewidth-$t$ modulator.
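
The three conditions of a tree decomposition can be checked mechanically. The following Python sketch does so for small examples; the data layout (a dictionary from tree nodes to bags, plus a list of tree edges) and the function names are assumptions made for illustration, and the code assumes without checking that `tree_edges` really forms a tree.

```python
def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check conditions (i)-(iii) for bags: dict mapping tree node -> set of graph vertices."""
    # (i) every graph vertex occurs in some bag (and nothing else does)
    covered = set().union(*bags.values()) if bags else set()
    if covered != set(vertices):
        return False
    # (ii) every graph edge is contained in some bag
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # (iii) for every vertex, the tree nodes whose bags contain it are connected in T
    adj = {i: set() for i in bags}
    for i, j in tree_edges:
        adj[i].add(j)
        adj[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in adj[i]:
                if j in nodes and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# A path a-b-c-d has a path decomposition of width 1.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
tree_edges = [(1, 2), (2, 3)]
good = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
bad = {1: {"a", "b"}, 2: {"c", "d"}, 3: {"b", "c"}}  # violates (iii) for vertex b
print(is_tree_decomposition(vertices, edges, tree_edges, good), width(good))
print(is_tree_decomposition(vertices, edges, tree_edges, bad))
```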

For a positive integer $r$, an $r$-protrusion in a graph $G$ is a vertex set $X \subseteq V(G)$ such
that $\mathrm{tw}(G[X]) \le r - 1$ and $|\partial_G(X)| \le r$. An $(\alpha, r)$-protrusion decomposition of a graph $G$ is a
partition $\mathcal{P} = Y_0 \uplus Y_1 \uplus \ldots \uplus Y_\ell$ of $V(G)$ such that (1) for every $1 \le i \le \ell$ we have $N_G(Y_i) \subseteq Y_0$,
(2) $\max(\ell, |Y_0|) \le \alpha$, and (3) for every $1 \le i \le \ell$ the set $Y_i \cup N_G(Y_i)$ is an $r$-protrusion in $G$.
We sometimes refer to $Y_0$ as the shared part.


3 ILPs of bounded treewidth 


We analyze the influence of treewidth for preprocessing ILPF. In Section 3.1 we give formal
definitions to capture the treewidth of an ILP, and introduce a special type of tree decompositions
to solve ILPs efficiently. In Section 3.2 we study the parameterized complexity of ILPF
parameterized by treewidth. Tractability turns out to depend on the domain of the variables. An
instance $(A, b)$ of ILPF has domain size $d$ if, for every variable $x_i$, there are constraints $-x_i \le d'$
and $x_i \le d''$ for some $d' \ge 0$ and $d'' \le d - 1$. (All positive results work also under more
relaxed definitions of domain size $d$, e.g., any choice of $d$ integers for each variable, at the cost
of technical complication.) The feasibility of bounded-treewidth, bounded-domain ILPs is used
in Section 3.3 to formulate a protrusion replacement rule. It allows the number of variables in
an ILP of domain size $d$ that is decomposed by a $(k, r)$-protrusion decomposition to be reduced
to $O(k \cdot r \cdot d^r)$. In Section 3.4 we discuss limitations of the protrusion-replacement approach.


3.1 Tree decompositions of linear programs 

Given a constraint matrix $A \in \mathbb{Z}^{m \times n}$ we define the corresponding Gaifman graph $G = G(A)$ as
follows [10, Chapter 11]. We let $V(G) = \{x_1, \ldots, x_n\}$, i.e., the variables in $Ax \le b$ for $b \in \mathbb{Z}^m$.
We let $\{x_i, x_j\} \in E(G)$ if and only if there is an $r \in [m]$ with $A[r, i] \ne 0$ and $A[r, j] \ne 0$.
Intuitively, two vertices are adjacent if the corresponding variables occur together in some
constraint.

Observation 1. For every row $r$ of $A \in \mathbb{Z}^{m \times n}$, the variables $Y_r$ with nonzero coefficients in
row $r$ form a clique in $G(A)$. Consequently (cf. [3]), any tree decomposition $(T, \mathcal{X})$ of $G(A)$ has
a node $i$ with $Y_r \subseteq X_i$.

To simplify the description of our dynamic programming procedure, we will restrict the 
form of the tree decompositions that the algorithm is applied to. This is common practice when 
dealing with graphs of bounded treewidth: one works with nice tree decompositions consisting 
of leaf, join, forget, and introduce nodes. When using dynamic programming to solve ILPs it 
will be convenient to have another type of node, the constraint node, to connect the structure 
of the Gaifman graph to the constraints in the ILP. To this end, we define the notion of a nice 
Gaifman decomposition including constraint nodes.

Definition 3. Let $A \in \mathbb{Z}^{m \times n}$. A nice Gaifman decomposition of $A$ of width $w$ is a triple
$(T, \mathcal{X} = \{X_i \mid i \in V(T)\}, \mathcal{Z} = \{Z_i \mid i \in V(T)\})$, where $T$ is a rooted tree and $(T, \mathcal{X})$ is a
width-$w$ tree decomposition of the Gaifman graph $G(A)$ with:

(1) The tree $T$ has at most $4n + m$ nodes.

(2) Every row of $A$ is assigned to exactly one node of $T$. If row $j$ is mapped to node $i$ then $Z_i$
is a list of pointers to the nonzero coefficients in row $j$.

(3) Every node $i$ of $T$ has one of the following types:

leaf: $i$ has no children and $|X_i| = 1$,

join: $i$ has exactly two children $j, j'$ and $X_i = X_j = X_{j'}$,

introduce: $i$ has exactly one child $j$ and $X_i = X_j \cup \{v\}$ with $v \in V(G(A)) \setminus X_j$,

forget: $i$ has exactly one child $j$ and $X_i = X_j \setminus \{v\}$ with $v \in V(G(A)) \cap X_j$,

constraint: $i$ has exactly one child $j$, $X_i = X_j$, and $Z_i$ stores a constraint of $A$ involving
variables that are all contained in $X_i$.

The following proposition shows how to construct the Gaifman graph $G(A)$ for a given
matrix $A$. It will be used in later proofs.

Proposition 1. Given a matrix $A \in \mathbb{Z}^{m \times n}$ in which each row contains at most $r$ nonzero
entries, the $n \times n$ adjacency matrix of $G(A)$ can be constructed in $O(nm + mr^2 + n^2)$ time.

Proof. Initialize an all-zero $n \times n$ adjacency matrix $M$ in $O(n^2)$ time. Scan through $A$ to
collect the indices of the nonzero entries in each row in $O(nm)$ time. For each row $q$, for each
of the $O(r^2)$ pairs $A[q, i], A[q, j]$ of distinct nonzero entries in the row, set the corresponding
entries $M[i, j], M[j, i]$ of the adjacency matrix to one; over all $m$ rows this takes $O(mr^2)$ time. □
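
The construction in the proof can be sketched in a few lines of Python. This is a minimal illustration assuming the matrix is given as a list of rows; the name `gaifman_adjacency` is a hypothetical helper, not notation from the text.

```python
from itertools import combinations


def gaifman_adjacency(A):
    """Adjacency matrix of the Gaifman graph: variables i, j are adjacent
    if and only if some row of A has nonzero entries in both columns."""
    n = len(A[0]) if A else 0
    M = [[0] * n for _ in range(n)]          # all-zero n x n matrix
    for row in A:
        # indices of the nonzero coefficients of this constraint
        support = [j for j, coeff in enumerate(row) if coeff != 0]
        # the support of every row becomes a clique
        for i, j in combinations(support, 2):
            M[i][j] = M[j][i] = 1
    return M


A = [[1, 2, 0, 0],
     [0, 0, 3, 0],
     [0, 1, 0, -1]]
for line in gaifman_adjacency(A):
    print(line)
```

In this example the first row makes the variables with indices 0 and 1 adjacent, the third row makes 1 and 3 adjacent, and the variable with index 2 stays isolated because it only occurs alone.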

We show how to obtain a nice Gaifman decomposition for a matrix $A$ of width $w$ from any
tree decomposition of its Gaifman graph $G(A)$ of width $w$.

Proposition 2. There is an algorithm that, given $A \in \mathbb{Z}^{m \times n}$ and a width-$w$ tree decomposition
$(T, \mathcal{X})$ of the Gaifman graph of $A$, computes a nice Gaifman decomposition $(T', \mathcal{X}', \mathcal{Z}')$
of $A$ having width $w$ in $O(w^2 \cdot |V(T)| + n \cdot m \cdot w)$ time.

Proof. Building a nice tree decomposition. From the tree decomposition $(T, \mathcal{X})$ of $G(A)$ we can
derive a chordal supergraph of $G(A)$ with maximum clique size bounded by $w + 1$, by completing
the vertices of each bag into a clique m Lemma 2.1.1]. This can be done in $O(w^2 \cdot |V(T)|)$
time by scanning through the contents of the bags of $(T, \mathcal{X})$. A perfect elimination order of the
chordal supergraph can be used to obtain a nice tree decomposition $(T', \mathcal{X}')$ of $G(A)$ having
width $w$ on at most $4n$ nodes m Lemma 13.1.3]. The nice tree decomposition consists of leaf,
join, introduce, and forget nodes.

Incorporating constraint nodes. We augment the nice tree decomposition with constraint
nodes to obtain a nice Gaifman decomposition of $A$, as follows. We scan through matrix $A$ and
store, for each row, a list of pointers to the nonzero entries in that row. This takes $O(n \cdot m)$ time.
Since a graph of treewidth $w$ does not have cliques of size more than $w + 1$, by Observation 1
each row of $A$ has at most $w + 1$ nonzero entries. We maintain a list of the rows in $A$ that
have not yet been associated to a constraint bag in the Gaifman decomposition. We traverse
the rooted tree $T'$ in post-order. For each node $i$, we inspect the corresponding bag $X_i$ and
test, for each constraint that is not yet represented by the decomposition, whether all variables
involved in the constraint are contained in the bag. This can be determined in $O(w)$ time per
constraint as follows. For each variable in $X_i$ we test whether the corresponding row in $A$ has a
nonzero coefficient for that variable; if so, we increase a counter. If the final value of the counter
matches the precomputed number of nonzero coefficients in the row then the bag contains all
variables involved in the constraint. In that case we update the tree $T'$ as follows: we make a
new node $i'$, assign $X_{i'} := X_i$, and let $Z_{i'}$ be a copy of the precomputed list of pointers to the
nonzero coefficients in the constraint. We make $i'$ the parent of $i$. If $i$ is not the root, then it
originally had a parent $j$; we make $j$ the parent of $i'$ instead. This operation effectively splices
a node of degree two into the tree. Since the newly introduced node has the same bag as $i$,
the relation between the bags of parents and children for the existing nodes of the tree remains
unaltered (e.g., a forget node in the old tree will be a forget node in the new tree). The newly
introduced node $i'$ satisfies the requirements of a constraint node. We then continue processing
the remainder of the tree to obtain the final nice Gaifman decomposition $(T', \mathcal{X}', \mathcal{Z}')$. As the
original tree contains $O(n)$ nodes, while we spend $O(m \cdot w)$ time per node to incorporate the
constraint bags, this phase of the algorithm takes $O(n \cdot m \cdot w)$ time. By Observation 1, for
each constraint of $A$ the involved variables occur together in some bag. Hence we will detect
such a bag in the procedure, which results in a constraint node for the row. Since the nice tree
decomposition that we started from contained at most $4n$ nodes, while we introduce one node for
each constraint in $A$, the resulting tree has at most $4n + m$ nodes. This shows that $(T', \mathcal{X}', \mathcal{Z}')$
satisfies all properties of a nice Gaifman decomposition and concludes the proof. □

3.2 Feasibility on Gaifman graphs of bounded treewidth 

We discuss the influence of treewidth on the complexity of ILPF. It turns out that for unbounded
domain variables the problem remains weakly NP-hard on instances with Gaifman graphs of
treewidth at most two (Theorem 2). On the other hand, the problem can be solved by a simple
dynamic programming algorithm with runtime $O^*(d^{w+1})$, where $d$ is the domain size and $w$
denotes the width of a given tree decomposition of the Gaifman graph (Theorem 3). In other
words, the problem is fixed-parameter tractable in terms of $d + w$, and efficiently solvable for
bounded treewidth and $d$ polynomially bounded in the input size.

Both results are not hard to prove and fixed-parameter tractability of ILPF($d + w$) can also
be derived from Courcelle's theorem (cf. [10, Corollary 11.43]). Nevertheless, for the sake of
self-containment and concrete runtime bounds we provide direct proofs. Theorem 3 is a subroutine
of our protrusion reduction algorithm.

Theorem 2. ILP FEASIBILITY remains weakly NP-hard when restricted to instances $(A, b)$ whose
Gaifman graph $G(A)$ has treewidth two.

Proof. We give a straightforward reduction from SUBSET SUM to this restricted variant of ILPF.
Recall that an instance of SUBSET SUM consists of a set $\{a_1, \ldots, a_n\}$ of integers and a target
value $b \in \mathbb{Z}$; the task is to determine whether some subset of the $n$ integers sums to exactly $b$.
Given such an instance $(\{a_1, \ldots, a_n\}, b)$ we create $n$ variables $x_1, \ldots, x_n$ that encode the selection
of a subset and $n$ variables $y_1, \ldots, y_n$ that effectively store partial sums; the $x_i$ variables are
constrained to domain $\{0, 1\}$. Concretely, we aim to compute

$$y_j = \sum_{i=1}^{j} a_i x_i$$

for all $j \in \{1, \ldots, n\}$. Clearly, this is correctly enforced by the following constraints.

$$y_1 = a_1 x_1$$
$$y_j = y_{j-1} + a_j x_j \qquad j \in \{2, \ldots, n\}$$

Finally, we enforce $y_n = b$. Clearly, a subset of the $n$ integers with sum $b$ translates canonically
to a feasible assignment of the variables, and vice versa.

It remains to check the treewidth of the corresponding Gaifman graph. We note that
for this purpose it is not necessary to split equalities into inequalities or perform similar
normalizations, since this does not affect whether sets of variables occur in at least one shared
constraint. Thus, we can make a tree decomposition (in fact, a path decomposition) consisting
of a path on nodes $1, \ldots, n$ with bags $X_1, \ldots, X_n$ where $X_1 = \{x_1, y_1\}$ and $X_j = \{x_j, y_{j-1}, y_j\}$
for $j \ge 2$. Clearly, for each constraint there is a bag containing all its variables, correctly
handling all edges of the Gaifman graph, and the bags containing any variable form a connected
subtree. It follows that the Gaifman graph of the constructed instance has treewidth at most
two. □
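
The reduction can be written out as a short Python sketch. It is only an illustration: the variable order $x_1, \ldots, x_n, y_1, \ldots, y_n$, the splitting of each equality into two inequalities, and the helper names are assumptions made here, and the brute-force feasibility check is included solely to sanity-check tiny instances.

```python
from itertools import product


def subset_sum_to_ilp(a, target):
    """Build (A, b) with Ax <= b over columns x_1..x_n, y_1..y_n (assumes n >= 1)."""
    n = len(a)
    A, b = [], []

    def add_le(coeffs, bound):
        row = [0] * (2 * n)
        for col, c in coeffs.items():
            row[col] = c
        A.append(row)
        b.append(bound)

    def add_eq(coeffs, value):
        # an equality is expressed as two opposite inequalities
        add_le(coeffs, value)
        add_le({col: -c for col, c in coeffs.items()}, -value)

    for i in range(n):
        add_le({i: -1}, 0)                 # -x_i <= 0
        add_le({i: 1}, 1)                  #  x_i <= 1
    add_eq({n: 1, 0: -a[0]}, 0)            # y_1 = a_1 x_1
    for j in range(1, n):
        add_eq({n + j: 1, n + j - 1: -1, j: -a[j]}, 0)   # y_j = y_{j-1} + a_j x_j
    add_eq({2 * n - 1: 1}, target)         # y_n = b
    return A, b


def is_feasible_bruteforce(a, A, b):
    """Try all 0/1 choices for x; the y values are then forced to be partial sums."""
    for x in product((0, 1), repeat=len(a)):
        y, partial = [], 0
        for ai, xi in zip(a, x):
            partial += ai * xi
            y.append(partial)
        point = list(x) + y
        if all(sum(c * v for c, v in zip(row, point)) <= bound
               for row, bound in zip(A, b)):
            return True
    return False


a = [3, 5, 7]
A, b = subset_sum_to_ilp(a, 12)
print(len(A), max(sum(1 for c in row if c != 0) for row in A))
print(is_feasible_bruteforce(a, A, b))
```

Every constraint involves at most three variables, which matches the bags $\{x_j, y_{j-1}, y_j\}$ of the path decomposition described in the proof.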

Theorem 3. Instances $(A \in \mathbb{Z}^{m \times n}, b)$ of ILPF of domain size $d$ with a given nice Gaifman
decomposition of width $w$ can be solved in time $O(d^{w+1} \cdot w \cdot (n + m))$.

Proof. Let $(A, b)$ denote an instance of ILPF of domain size $d$ and let $(T, \mathcal{X}, \mathcal{Z})$ denote a given
nice Gaifman decomposition of width $w$ for $A$. We describe a simple dynamic programming
algorithm for testing whether there exists an integer vector $x$ such that $Ax \le b$. For ease of
presentation we assume that each domain is $\{0, \ldots, d - 1\}$; it is straightforward, but technical,
to use arbitrary (possibly different) domains of at most $d$ values for each variable.

With a node $i$, apart from its bag $X_i$, we associate the set $V_i$ of all variables appearing in $X_i$
or in the bag $X_j$ of any descendant $j$ of $i$. By $C_i$ we denote all constraints $p$ (rows of $A$) that
the Gaifman decomposition maps to a descendant of $i$ (including $i$ itself).

Our goal is to compute for each node $i$ the set of all feasible assignments to the variables
in $X_i$ when taking into account only constraints in $C_i$. The set of feasible assignments for $X_i$
will be recorded in a table $F_i$ indexed by tuples in $\{0, \ldots, d - 1\}^{\ell}$ where $\ell = |X_i|$. The entry
at $F_i(a_1, \ldots, a_\ell)$ corresponds to assigning $x_{i_1} = a_1, \ldots, x_{i_\ell} = a_\ell$ where $X_i = \{x_{i_1}, \ldots, x_{i_\ell}\}$
and $i_1 < \ldots < i_\ell$. Its value will be 1 if we determined that there is a feasible assignment for $V_i$
with respect to constraints $C_i$ that extends $x_{i_1} = a_1, \ldots, x_{i_\ell} = a_\ell$; otherwise the value is 0.
We will now describe how to compute the tables $F_i$ in a bottom-up manner, by outlining the
behavior on each node type.

• Leaf node $i$ with bag $X_i = \{x_q\}$. Since $C_i = \emptyset$ as $i$ has no children, we simply have $F_i(a) = 1$
for all $a \in \{0, \ldots, d - 1\}$, which is computed in $O(d)$ time.

• Forget node $i$ with child $j$ and bag $X_i = X_j \setminus \{x_q\}$. It suffices to project $F_j$ down to only
contain information about the feasible assignments for $X_i = X_j \setminus \{x_q\}$. To make this
concrete, let $X_j = \{x_{j_1}, \ldots, x_{j_\ell}\}$ with $\ell = |X_j| = |X_i| + 1$ and $x_{j_s} = x_q$. Thus, $X_i =
\{x_{j_1}, \ldots, x_{j_{s-1}}, x_{j_{s+1}}, \ldots, x_{j_\ell}\}$. We let

$$F_i(a_1, \ldots, a_{s-1}, a_{s+1}, \ldots, a_\ell) := \max_{a \in \{0, \ldots, d-1\}} F_j(a_1, \ldots, a_{s-1}, a, a_{s+1}, \ldots, a_\ell)$$

for all $(a_1, \ldots, a_{s-1}, a_{s+1}, \ldots, a_\ell) \in \{0, \ldots, d - 1\}^{\ell - 1}$, i.e., we set $F_i(\ldots)$ to 1 if some choice
of $a$ extends the assignment to one that is feasible for $X_j$ with respect to all constraints
on $V_j = V_i$; else it takes value 0. Each table entry takes time $O(d)$ and in total we spend
time $O(d \cdot d^{|X_i|}) \subseteq O(d^{w+1})$; note that $X_i = X_j \setminus \{x_q\}$ implies $|X_i| \le w$.

• Introduce node $i$ with child $j$ and bag $X_i = X_j \cup \{x_q\}$. Let $X_i = \{x_{i_1}, \ldots, x_{i_\ell}\}$ with $\ell = |X_i|$
and $x_{i_s} = x_q$, implying that $X_j = \{x_{i_1}, \ldots, x_{i_{s-1}}, x_{i_{s+1}}, \ldots, x_{i_\ell}\}$. As $i$ is an introduce node
we have $C_i = C_j$, and therefore $C_i$ does not constrain the value of $x_q$ in any way. We
set $F_i$ as follows.

$$F_i(a_1, \ldots, a_\ell) := F_j(a_1, \ldots, a_{s-1}, a_{s+1}, \ldots, a_\ell)$$

for all $(a_1, \ldots, a_\ell) \in \{0, \ldots, d - 1\}^{\ell}$. We use time $O(d^{|X_i|}) \subseteq O(d^{w+1})$.

• Join node $i$ with children $j, j'$ and bag $X_i = X_j = X_{j'}$. At a join node $i$, from child $j$
we get all assignments that are feasible for $X_i = X_j$ regarding constraints $C_j$, and from
child $j'$ we get the feasible assignments for constraints $C_{j'}$. We have $C_i = C_j \cup C_{j'}$, and
therefore an assignment is feasible for $C_i$ if and only if it is feasible for both $C_j$ and $C_{j'}$.
It suffices to merge the information of the two child nodes. Letting $X_i = \{x_{i_1}, \ldots, x_{i_\ell}\}$
with $\ell = |X_i|$, we set

$$F_i(a_1, \ldots, a_\ell) := \min\{F_j(a_1, \ldots, a_\ell), F_{j'}(a_1, \ldots, a_\ell)\}$$

for all $(a_1, \ldots, a_\ell) \in \{0, \ldots, d - 1\}^{\ell}$. This takes time $O(d^{|X_i|}) \subseteq O(d^{w+1})$.

• Constraint node $i$ with child $j$ mapped to row $p$. Let $X_i = \{x_{i_1}, \ldots, x_{i_\ell}\}$ with $\ell = |X_i|$.
We know that $C_i = C_j \cup \{p\}$ and therefore an assignment of values to $X_i$ is feasible with
respect to the constraints $C_i$ if and only if it is feasible for the constraints $C_j$ and also
satisfies constraint $p$. We therefore initialize by setting $F_i$ as follows.

$$F_i(a_1, \ldots, a_\ell) := F_j(a_1, \ldots, a_\ell)$$

for all $(a_1, \ldots, a_\ell) \in \{0, \ldots, d - 1\}^{\ell}$. Now, we need to discard those assignments $(a_1, \ldots, a_\ell)$
that are not feasible for the additional constraint $p$. For each $(a_1, \ldots, a_\ell)$ with $F_i(a_1, \ldots, a_\ell) = 1$
we process the row $p$. Using the pointers to the nonzero coefficients in row $p$ that are
stored in $Z_i$, along with the fact that all variables constrained by $p$ are contained in bag $X_i$
by Definition 3, we can evaluate the constraint in $O(w)$ time. If the sum of values times
coefficients exceeds $b[p]$ then the constraint is not satisfied and we set $F_i(a_1, \ldots, a_\ell)$ to 0. This
takes time $O(|X_i|) \subseteq O(w)$ per assignment. In total we use time $O(w \cdot d^{|X_i|}) \subseteq O(d^{w+1} w)$.

At the end of the computation we have the table $F_r$ where $r$ denotes the root of the tree
decomposition $(T, \mathcal{X})$. By definition, it encodes all assignments to $X_r$ that can be extended to
assignments that are feasible for all constraints in $C_r$. By Definition 3 the set $C_r$ for the root
contains all constraints of $A$ and thus any entry $F_r(a_1, \ldots, a_{|X_r|}) = 1$ implies that $Ax \le b$ has
an integer solution. Conversely, any integer solution must lead to a nonzero entry in $F_r$. By
Definition 3 the number of nodes in $T$ is $O(n + m)$. The total time needed for the dynamic
programming is therefore bounded by $O(d^{w+1} \cdot w \cdot (n + m))$. □

If a nice Gaifman decomposition is not given, one can be computed by combining an algorithm
for computing tree decompositions with Proposition 2.
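
The dynamic program of Theorem 3 can be sketched compactly in Python. The sketch below makes several simplifying assumptions that are not part of the text: nodes are nested dictionaries with hypothetical keys `type`, `bag`, `children`, and `row`; bags are sorted tuples of column indices; every domain is $\{0, \ldots, d-1\}$ and is enforced by enumeration rather than by explicit rows; and each table $F_i$ is stored as the set of tuples whose entry would be 1.

```python
from itertools import product


def feasible_table(node, A, b, d):
    """Set of assignments to node['bag'] (aligned with the sorted bag) that extend
    to all variables below the node while satisfying all constraints below it."""
    bag, kind = node["bag"], node["type"]
    if kind == "leaf":
        return {(a,) for a in range(d)}            # no constraints yet
    child = node["children"][0]
    F_child = feasible_table(child, A, b, d)
    if kind == "forget":
        # project away the forgotten variable (the max over its value)
        keep = [child["bag"].index(v) for v in bag]
        return {tuple(t[k] for k in keep) for t in F_child}
    if kind == "introduce":
        # the new variable is unconstrained; the other positions must be feasible below
        old = [k for k, v in enumerate(bag) if v in child["bag"]]
        return {t for t in product(range(d), repeat=len(bag))
                if tuple(t[k] for k in old) in F_child}
    if kind == "join":
        # feasible for both subtrees (the min of the two tables)
        return F_child & feasible_table(node["children"][1], A, b, d)
    if kind == "constraint":
        p = node["row"]                            # all nonzeros of row p lie in the bag
        return {t for t in F_child
                if sum(A[p][v] * a for v, a in zip(bag, t)) <= b[p]}
    raise ValueError("unknown node type: " + kind)


# Example: x0 + x1 >= 3 and x1 + x2 <= 1, written as Ax <= b.
A = [[-1, -1, 0], [0, 1, 1]]
b = [-3, 1]
leaf = {"type": "leaf", "bag": (0,), "children": []}
n1 = {"type": "introduce", "bag": (0, 1), "children": [leaf]}
n2 = {"type": "constraint", "bag": (0, 1), "children": [n1], "row": 0}
n3 = {"type": "forget", "bag": (1,), "children": [n2]}
n4 = {"type": "introduce", "bag": (1, 2), "children": [n3]}
root = {"type": "constraint", "bag": (1, 2), "children": [n4], "row": 1}

print(feasible_table(root, A, b, 3))   # nonempty: feasible with domain {0, 1, 2}
print(feasible_table(root, A, b, 2))   # empty: infeasible with domain {0, 1}
```

The instance is feasible exactly when the table at the root is nonempty. Using sets instead of full 0/1 tables keeps the sketch short; it does not change the worst-case size of $d^{w+1}$ entries per node.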

3.3 Protrusion reductions 

To formulate the protrusion replacement rule, which is the main algorithmic asset used in this 
section, we need some terminology. For a non-negative integer r, an r-boundaried ILP is an 
instance {A, b) of ilpf in which r distinct boundary variables xt ^,..., xt^ are distinguished among 
the total variable set {xi,..., x^}. If U = (xj^,..., Xj^) is a sequence of variables of Ax < b, 
we will also use {A, b, Y) to denote the corresponding r-boundaried ILP. The feasible boundary 
assignments of a boundaried ILP are those assignments to the boundary variables that can be 
completed into an assignment that is feasible for the entire system. 

Definition 4. Two $r$-boundaried ILPs $(A, b, x_{t_1}, \ldots, x_{t_r})$ and
$(A', b', x'_{t'_1}, \ldots, x'_{t'_r})$ are equivalent if they have the same feasible boundary
assignments.

The following lemma shows how to compute equivalent boundaried ILPs for any boundaried
input ILP. The replacement system is built by adding, for each infeasible boundary assignment, 
a set of constraints on auxiliary variables that explicitly blocks that assignment. 
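
Definition 4 can be made concrete on tiny instances by brute force. The following is a minimal sketch, not part of the paper's algorithm: the function names and the dense list-of-rows representation are assumptions chosen for illustration, columns are 0-indexed, and every variable is assumed to range over {0, ..., d-1}.

```python
from itertools import product

def feasible_boundary_assignments(A, b, boundary, d):
    """Return all boundary assignments that extend to a solution of A x <= b.

    A is a list of rows, b the right-hand sides, boundary a list of column
    indices. Enumerates all d**n assignments, so only usable for tiny systems.
    """
    n = len(A[0])
    feasible = set()
    for x in product(range(d), repeat=n):
        if all(sum(c * v for c, v in zip(row, x)) <= rhs for row, rhs in zip(A, b)):
            feasible.add(tuple(x[j] for j in boundary))
    return feasible

def equivalent(ilp1, ilp2, d):
    """Definition 4: two boundaried ILPs with the same feasible boundary assignments."""
    (A1, b1, Y1), (A2, b2, Y2) = ilp1, ilp2
    return (feasible_boundary_assignments(A1, b1, Y1, d)
            == feasible_boundary_assignments(A2, b2, Y2, d))

# Three 1-boundaried ILPs over two 0/1 variables, boundary = first variable.
ilp_a = ([[1, 1]], [1], [0])            # x0 + x1 <= 1
ilp_b = ([[1, 0], [0, 1]], [1, 1], [0]) # x0 <= 1, x1 <= 1
ilp_c = ([[1, 1]], [0], [0])            # x0 + x1 <= 0
```

Here `ilp_a` and `ilp_b` admit both boundary values and are therefore equivalent, while `ilp_c` only admits the boundary value 0.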

Lemma 1. There is an algorithm with the following specifications: (1) It gets as input an
$r$-boundaried ILP $(A, b, x_{t_1}, \ldots, x_{t_r})$ with domain size $d$, with
$A \in \mathbb{Z}^{m \times n}$, $b \in \mathbb{Z}^m$, and a width-$w$ nice Gaifman decomposition
$(T, \mathcal{X}, \mathcal{Z})$ of $A$. (2) Given such an input it takes time
$O(d^r \cdot (d^{w+1} w (n + m)) + r^2 \cdot d^{2r})$. (3) Its output is an equivalent $r$-boundaried
ILP $(A', b', x'_{t'_1}, \ldots, x'_{t'_r})$ of domain size $d$ containing $O(r \cdot d^r)$ variables
and constraints, and all entries of $(A', b')$ in $\{-d, \ldots, d\}$.

Proof. The lemma is a combination of two ingredients. Using the dynamic programming algorithm
for ILPs that come with a nice Gaifman decomposition, we can efficiently test whether a given
assignment to the boundary variables can be extended to a feasible solution for $(A, b)$. Then,
knowing the set of all assignments that can be feasibly extended, we block each infeasible partial
assignment by introducing a small number of variables and constraints, allowing us to fully
discard the original constraints and non-boundary variables. The latter step uses a construction
from an earlier paper by Kratsch (Theorem 5 there), which we repeat here for completeness.

Finding feasible partial assignments. Consider a partial assignment
$x_{t_1} = a_1, \ldots, x_{t_r} = a_r$ to the boundary variables with $a_i \in \{0, \ldots, d - 1\}$.
To determine whether this assignment can be extended to a feasible assignment for $(A, b)$, we
enforce these equalities in the ILP. Concretely, for each $i$ we replace the domain-bounding
constraints for $x_{t_i}$ in the system $(A, b)$ by $-x_{t_i} \le -a_i$ and $x_{t_i} \le a_i$. We
obtain a new system $(\hat{A}, \hat{b})$ with $m$ constraints. Since the modified constraints
involve only a single variable each, the modifications do not affect the Gaifman graph:
$G(A) = G(\hat{A})$. Moreover, the modifications do not affect which entries in the constraint
matrix have nonzero values, implying that $(T, \mathcal{X}, \mathcal{Z})$ also serves as a nice
Gaifman decomposition of $\hat{A}$. The partial assignment to the boundary variables can be feasibly
extended if and only if $(\hat{A}, \hat{b})$ is feasible. We may invoke the dynamic programming
algorithm with $(T, \mathcal{X}, \mathcal{Z})$ to decide feasibility in $O(d^{w+1} w (n + m))$ time.
By iterating over all $d^r$ possible partial assignments to the boundary variables with values in
$\{0, \ldots, d - 1\}$, we determine which partial assignments can be feasibly extended. Let
$\mathcal{L}$ be a list of the partial assignments that cannot be extended to a feasible solution
for $(A, b)$.

Blocking infeasible partial assignments. Using $\mathcal{L}$ we construct an equivalent
$r$-boundaried ILP $(A', b', x'_{t'_1}, \ldots, x'_{t'_r})$ as follows. Based on the length of
$\mathcal{L}$ we can determine the number of variables that will be used in $(A', b')$, which helps
to write down the constraint matrix efficiently. The number of variables in the new system will be
$r + 2r|\mathcal{L}|$, the number of constraints will be $2r + (6r + 1)|\mathcal{L}|$. The system is
built as follows. For each boundary variable $x_{t_i}$ of $(A, b)$ we introduce a corresponding
variable $x'_{t'_i}$ in $(A', b')$ and constrain its domain to $\{0, \ldots, d - 1\}$ using two
inequalities; this yields $r$ variables and $2r$ constraints.

For each infeasible partial assignment $a^\ell = (a^\ell_1, \ldots, a^\ell_r)$ in the list
$\mathcal{L}$, we add new variables $u^\ell_i$ and $v^\ell_i$ for all $i \in [r]$, together with the
following constraints:

$$\forall i \in [r]: \quad x'_{t'_i} - a^\ell_i = u^\ell_i - d \cdot v^\ell_i \qquad (1)$$

$$\sum_{i=1}^{r} u^\ell_i \ge 1 \qquad (2)$$

We claim that an assignment to the boundary variables can be extended to the newly introduced
variables to satisfy the constraints for $a^\ell$ if and only if the partial assignment is not
$a^\ell$. In the first direction, assume that
$(x'_{t'_1}, \ldots, x'_{t'_r}) = (a^\ell_1, \ldots, a^\ell_r)$. Then
$0 = x'_{t'_i} - a^\ell_i = u^\ell_i - d \cdot v^\ell_i$, implying that $u^\ell_i = v^\ell_i = 0$
(taking into account the domains of $u^\ell_i$ and $v^\ell_i$) for all $i$. Therefore constraint (2)
is violated, which shows that the partial assignment cannot be feasibly extended. In the other
direction, if $(x'_{t'_1}, \ldots, x'_{t'_r}) \ne (a^\ell_1, \ldots, a^\ell_r)$, then there is a
position $i$ with $x'_{t'_i} \ne a^\ell_i$. It follows that $0 < |x'_{t'_i} - a^\ell_i| < d$ (due to
the domain of $x'_{t'_i}$), which in turn implies that $u^\ell_i \ne 0$ since the contribution of
$d \cdot v^\ell_i$ to the equality (1) is a multiple of $d$. Therefore constraint (2) is fulfilled.

The only coefficients used in the constraints that block a partial assignment are
$\{-d, -1, 0, 1, d\}$; as the equalities of (1) are represented using two inequalities, we get
coefficients $+d$ and $-d$. The values of $a^\ell_i$, which appear in the right-hand side vector
$b'$, are in $\{-(d - 1), \ldots, 0, \ldots, d - 1\}$ since they arise from the coordinates of an
attempted partial assignment (with negative values from representing equalities by inequalities).
As the coefficients of the domain-enforcing constraints are all plus or minus one, with right-hand
side in $\{-(d - 1), \ldots, 0, \ldots, d - 1\}$, the structure of the constructed system
$(A', b')$ matches that described in the lemma statement. For each infeasible partial assignment we
introduce $2r$ variables with 2 domain-enforcing constraints each, along with $2r$ inequalities to
express the $r$ equalities of (1) and a single constraint for (2). The total system therefore has
$r + 2r|\mathcal{L}|$ variables and $2r + (6r + 1)|\mathcal{L}|$ constraints. Since there are only
$d^r$ partial assignments that we check, we have $|\mathcal{L}| \le d^r$ and therefore the system
has $O(r \cdot d^r)$ variables and constraints. Consequently, the constraint matrix $A'$ has
$O(r^2 \cdot d^{2r})$ entries and it is not hard to verify that it can be written down in time linear
in this number. It remains to prove that the $r$-boundaried ILP
$(A', b', x'_{t'_1}, \ldots, x'_{t'_r})$ is equivalent to the input structure.

Consider an assignment to the boundary variables. If the assignment can be extended to a feasible
assignment for $(A, b)$, then the boundary variables take values in $\{0, \ldots, d - 1\}$ (since
$(A, b)$ has domain size $d$) and therefore satisfy the domain restrictions of $(A', b')$. Since the
partial assignment has a feasible extension, it is not on the list $\mathcal{L}$. For each set of
constraints that was added to block an infeasible partial assignment $a^\ell$, the claim above
therefore shows that the related variables $u^\ell_i$ and $v^\ell_i$ can be set to satisfy their
constraints. Hence the partial assignment can be extended to a feasible assignment for $(A', b')$.
In the reverse direction, suppose that a partial assignment can be feasibly extended for
$(A', b')$. By the claim above, the partial assignment differs from each of the blocked points on
$\mathcal{L}$. Since $\mathcal{L}$ contains all infeasible partial assignments with values in
$\{0, \ldots, d - 1\}^r$, and feasible partial assignments for $(A', b')$ belong to
$\{0, \ldots, d - 1\}^r$ since we restricted the domain of $x'_{t'_1}, \ldots, x'_{t'_r}$, there is
an extension feasible for $(A, b)$. This shows that the two $r$-boundaried ILPs are indeed
equivalent, and concludes the proof. □
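
The blocking step of the proof can be sketched in code. This is an illustration under stated assumptions rather than the authors' implementation: the function names and the variable ordering are hypothetical, the auxiliary variables $u$ and $v$ are given the same domain {0, ..., d-1} as all other variables, and the blocking constraints are taken to be $x'_i - a_i = u_i - d \cdot v_i$ for every $i$ together with $\sum_i u_i \ge 1$. The brute-force checker exists only to confirm the behavior on tiny inputs.

```python
from itertools import product

def build_blocking_system(r, d, blocked):
    """Rows (coefficients, rhs) of A' x <= b' over the variables
    [x'_1..x'_r, u^1_1, v^1_1, ..., u^1_r, v^1_r, u^2_1, ...]."""
    n_vars = r + 2 * r * len(blocked)
    rows = []

    def add(entries, rhs):
        row = [0] * n_vars
        for col, coef in entries:
            row[col] = coef
        rows.append((row, rhs))

    def add_domain(col):
        add([(col, -1)], 0)      # -x <= 0
        add([(col, 1)], d - 1)   #  x <= d - 1

    for i in range(r):
        add_domain(i)
    for l, a in enumerate(blocked):
        base = r + 2 * r * l
        for i in range(r):
            u, v = base + 2 * i, base + 2 * i + 1
            add_domain(u)
            add_domain(v)
            # x'_i - a_i = u - d*v, written as two inequalities
            add([(i, 1), (u, -1), (v, d)], a[i])
            add([(i, -1), (u, 1), (v, -d)], -a[i])
        # sum_i u_i >= 1, written as -sum_i u_i <= -1
        add([(base + 2 * i, -1) for i in range(r)], -1)
    return n_vars, rows

def feasible_boundary(r, d, blocked):
    """Brute-force the boundary assignments that extend to a solution."""
    n_vars, rows = build_blocking_system(r, d, blocked)
    result = set()
    # Values -1 and d are tried as well, so the domain rows are really exercised.
    for x in product(range(-1, d + 1), repeat=n_vars):
        if all(sum(c * v for c, v in zip(row, x)) <= rhs for row, rhs in rows):
            result.add(x[:r])
    return result
```

For `r = 2`, `d = 3` and the single blocked point `(1, 2)`, the system has `2 + 4 = 6` variables and `4 + 13 = 17` rows, matching the counts $r + 2r|\mathcal{L}|$ and $2r + (6r + 1)|\mathcal{L}|$.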

Intuitively, we can simplify an ilpf instance $(A, b)$ with a given protrusion decomposition
by replacing all protrusions with equivalent boundaried ILPs of small size via Lemma 1. We
get a new instance containing all replacement constraints plus all original constraints that are
fully contained in the shared part.

Theorem 4. For each constant $r$ there is an algorithm that, given an instance $(A, b)$ of ilpf
with domain size $d$, along with a $(k, r)$-protrusion decomposition
$Y_0 \cup Y_1 \cup \ldots \cup Y_\ell$ of the given Gaifman graph $G(A)$, outputs an equivalent
instance $(A', b')$ of ilpf with domain size $d$ on $O(k \cdot d^r)$ variables in time
$O(n \cdot m + d^{2r}(n + m) + k \cdot m \cdot d^r + k^2 \cdot d^{2r})$. Each constraint of
$(A', b')$ is either a constraint in $(A, b)$ involving only variables from $Y_0$, or one of
$O(k \cdot d^r)$ new constraints with coefficients and right-hand side among $\{-d, \ldots, d\}$.

Proof. The main idea of the proof is to apply Lemma 1 to replace each protrusion in the Gaifman
graph by a small subsystem that is equivalent with respect to the boundary variables. For the
sake of efficiency, we start by scanning through $A$ once to compute for each row of $A$ a list
of pointers to the nonzero coefficients in that row. This takes $O(n \cdot m)$ time. We handle the
protrusions $Y_i$ for $i \in [\ell]$ (this implies $i \ge 1$) consecutively, iteratively replacing
each protrusion by a small equivalent boundaried ILP to obtain an equivalent instance.

Replacing protrusions. Consider some $Y_i$ with $i \ge 1$; we show how to replace the variables
$Y_i$ by a small subsystem that has the same effect on the existence of global solutions. The
definition of protrusion decomposition ensures that $N_{G(A)}(Y_i) \subseteq Y_0$ and that
$G_i := G(A)[Y_i \cup N_{G(A)}(Y_i)]$ has treewidth at most $r - 1$. From the system $(A, b)$ we
extract the constraints involving at least one variable in $Y_i$. We collect these constraints,
together with domain-enforcing constraints for $N_{G(A)}(Y_i) \cap Y_0$, into a subsystem
$(A_i, b_i)$. Let $n_i$ and $m_i$ be the number of variables and constraints in $(A_i, b_i)$,
respectively. Since $|N_{G(A)}(Y_i) \cap Y_0| \le r$ by the definition of a protrusion
decomposition, we have $n_i \le |Y_i| + r$. Since the variables with nonzero coefficients in a
constraint induce a clique in the Gaifman graph, while $G_i$ has treewidth at most $r - 1$ and
therefore does not have cliques of size more than $r$, it follows that a constraint involving a
variable from $Y_i$ acts on at most $r$ variables. We can therefore identify the constraints
involving a variable in $Y_i$ by only inspecting the rows of $A$ containing at most $r$ nonzero
entries, which implies that the system $(A_i, b_i)$ can be extracted in
$O(m \cdot r + n_i \cdot m_i) \subseteq O(m + n_i \cdot m_i)$ time.
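
The extraction step can be sketched as follows. This is a simplified illustration with hypothetical names and a dense row representation (the proof works with precomputed pointer lists instead); it returns only the indices of the selected rows and the boundary variables, and leaves out the domain-enforcing constraints for the boundary that the proof adds to $(A_i, b_i)$.

```python
def extract_subsystem(A, Y_i, r):
    """Find the rows of A that involve a variable of Y_i.

    Rows with more than r nonzero entries are skipped without further
    inspection: they cannot touch Y_i, since cliques in G_i have size at most r.
    Returns (row indices, sorted list of boundary variables).
    """
    Y_i = set(Y_i)
    rows, touched = [], set()
    for p, row in enumerate(A):
        support = [j for j, coef in enumerate(row) if coef != 0]
        if len(support) > r:
            continue
        if any(j in Y_i for j in support):
            rows.append(p)
            touched.update(support)
    boundary = sorted(touched - Y_i)  # neighbors of Y_i, which lie in Y_0
    return rows, boundary

# Four variables x0..x3; the protrusion is Y_i = {x2, x3}.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
rows, boundary = extract_subsystem(A, Y_i=[2, 3], r=2)
```

In this example rows 1 and 2 are selected and the only boundary variable is `x1`.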

Let $\{x_{t_{i,1}}, \ldots, x_{t_{i,r'}}\}$ be the neighbors of $Y_i$ in $G(A)$, i.e., the variables
of $Y_0$ that appear in a common constraint with a variable in $Y_i$. As $r$ is a constant, we can
compute a tree decomposition $(T_i, \mathcal{X}_i = \{X_{i,j} \mid j \in V(T_i)\})$ of $G_i$ of
width $r - 1$ with $O(n_i)$ bags in $O(n_i)$ time [HE]. Using the proposition that converts a tree
decomposition into a nice Gaifman decomposition, this yields a nice Gaifman decomposition
$(T'_i, \mathcal{X}'_i, \mathcal{Z}'_i)$ of $A_i$ in $O(n_i \cdot m_i)$ time. Interpreting
$(A_i, b_i, x_{t_{i,1}}, \ldots, x_{t_{i,r'}})$ as an $r'$-boundaried ILP, we invoke Lemma 1 to
compute an equivalent $r'$-boundaried ILP $(A'_i, b'_i, x'_{t_{i,1}}, \ldots, x'_{t_{i,r'}})$ in
$O(d^{2r}(n_i + m_i))$ time for constant $r$. By Lemma 1, the numbers in the system $(A'_i, b'_i)$
are restricted to the set $\{-d, \ldots, d\}$ and $(A'_i, b'_i)$ has
$O(d^{r'}) \subseteq O(d^r)$ variables and constraints. We modify the instance $(A, b)$ as follows,
while preserving the fact that it has domain size $d$. We remove all variables from $Y_i$ and all
constraints involving them from the system $(A, b)$. For each non-boundary variable in
$(A'_i, b'_i)$ we add a corresponding new variable to $(A, b)$. For each constraint in
$(A'_i, b'_i)$, containing some boundary variables and some non-boundary variables, we add a new
constraint with the same coefficients and right-hand side to $(A, b)$. All occurrences of boundary
variables $x'_{t_{i,j}}$ of $(A'_i, b'_i)$ are replaced by the corresponding existing variables
$x_{t_{i,j}}$ of $(A, b)$; occurrences of non-boundary variables are replaced by occurrences of the
corresponding newly introduced variables.

Observe that these replacements preserve the variable set $Y_0$, and that the newly introduced
constraints only involve $Y_0$ and newly introduced variables. We can therefore perform this
replacement step independently for each protrusion $Y_i$ with $i \in [\ell]$. Since each variable
set $Y_i$ for $i \in [\ell]$ is removed and replaced by a new set of $O(d^r)$ variables, the final
system $(A', b')$ resulting from these replacements has $O(|Y_0| + \ell \cdot d^r)$ variables, which
is $O(k \cdot d^r)$ since the definition of a $(k, r)$-protrusion decomposition ensures that
$\max(\ell, |Y_0|) \le k$. When building $(A', b')$, the procedure above removes from $(A, b)$ all
constraints that involve at least one variable in $Y_i$ with $i \ge 1$. Hence the only constraints
in $(A', b')$ are (1) the constraints of $(A, b)$ that only involve variables in $Y_0$, and (2) the
$O(\ell \cdot d^r) \subseteq O(k \cdot d^r)$ new constraints that are copied from subsystems
$(A'_i, b'_i)$ for $i \in [\ell]$. Hence the constraints of the constructed instance satisfy the
claims in the theorem statement.

Running time. Let us consider the running time of the procedure. For each $i \in [\ell]$ the time
to extract the subsystem $(A_i, b_i)$ and compute an equivalent $r$-boundaried ILP is dominated
by $O(m + n_i \cdot m_i + d^{2r}(n_i + m_i))$, using the fact that $r$ is a constant. As we observed
earlier, $n_i \le |Y_i| + r$. Each constraint in $(A_i, b_i)$ is either a constraint of $A$
involving a variable in $Y_i$, or a domain-enforcing constraint on $N_{G(A)}(Y_i) \cap Y_0$; there
are at most $2r$ of the latter kind. The constraints of $A$ involving a variable in $Y_i$ do not
occur in systems $(A_{i'}, b_{i'})$ for $i' \ne i$, since this would imply that a variable in $Y_i$
is adjacent in $G(A)$ to a variable in $Y_{i'}$ for $i \ne i' \ne 0$, contradicting the definition
of a protrusion decomposition. Since all but $2r$ of the constraints of $(A_i, b_i)$ correspond to
constraints of $A$, which only occur once, it follows that

$$\sum_{i=1}^{\ell} m_i \le m + \ell \cdot 2r \le m + n \cdot 2r \le m + m \cdot 2r \in O(m),$$

using the fact that $\ell \le n$ by the definition of protrusion decomposition and $n \le m$ since
$(A, b)$ contains a domain-enforcing constraint for each variable. Similarly we have
$\sum_{i=1}^{\ell} n_i \in O(n)$, which shows that
$\sum_{i=1}^{\ell} (n_i \cdot m_i) \le (\sum_{i=1}^{\ell} n_i) \cdot (\sum_{i=1}^{\ell} m_i) \in O(n \cdot m)$.
From this it follows that, summing the total running time of computing replacements for all
protrusions $Y_i$ with $i \in [\ell]$, we obtain the bound
$\sum_{i=1}^{\ell} O(m + n_i \cdot m_i + d^{2r}(n_i + m_i)) \subseteq O(n \cdot m + d^{2r}(n + m))$,
using again that $\ell \cdot m \le n \cdot m$. After all replacements have been computed, the time
to construct the output instance $(A', b')$ is dominated by the total size of the resulting matrix
$A'$. Let $n'$ and $m'$ be the number of variables and constraints in $A'x' \le b'$. The
argumentation above shows that $n' \in O(k \cdot d^r)$. Each constraint in $A'$ that is not in $A$
originates from a subsystem $(A'_i, b'_i)$ which has $O(d^r)$ constraints. Hence
$m' \in O(m + \ell \cdot d^r)$. The number of entries in $A'$ is therefore
$O((k \cdot d^r) \cdot (m + \ell \cdot d^r))$. Using the fact that $\ell \le k$ by the definition of
a protrusion decomposition, this simplifies to $O(k \cdot m \cdot d^r + k^2 \cdot d^{2r})$, which
also bounds the time to write down the system $(A', b')$. (We remark that these bounds can be
improved using a sparse matrix representation.) The total running time of the procedure is
therefore $O(n \cdot m + d^{2r}(n + m) + k \cdot m \cdot d^r + k^2 \cdot d^{2r})$. Having proven the
running time bound, it remains to show that the two instances of ilpf are equivalent.

Claim 1. There is an integer vector $x$ satisfying $Ax \le b$ if and only if there is an integer
vector $x'$ satisfying $A'x' \le b'$.

Proof. ($\Rightarrow$) In the first direction, assume that $Ax \le b$. Then there is a partial
assignment $x_{Y_0}$ of values to the variables $Y_0$ that can be extended to a feasible solution
for $(A, b)$, as $x$ is such an extension. Since all variables $Y_0$ also exist in $(A', b')$ we
can consider $x_{Y_0}$ as a partial assignment of variables for $(A', b')$. We prove that this
partial assignment $x_{Y_0}$ can be extended to a feasible assignment for $(A', b')$. To see this,
observe that all constraints in $(A', b')$ involving only variables of $Y_0$ also exist in
$(A, b)$, and are therefore satisfied by the partial assignment. All constraints involving at least
one variable from $Y_i$ for $i \ge 1$ were removed and replaced by constraints from a subsystem
$(A'_i, b'_i)$. Consider a subsystem $(A'_i, b'_i)$ whose constraints and variables were introduced
into $(A', b')$. Letting $r' := |N_{G(A)}(Y_i) \cap Y_0|$, Lemma 1 guarantees that the
$r'$-boundaried ILP $(A_i, b_i, N_{G(A)}(Y_i) \cap Y_0)$ is equivalent to the $r'$-boundaried ILP
$(A'_i, b'_i, N_{G(A)}(Y_i) \cap Y_0)$. Since $Ax \le b$, and $(A_i, b_i)$ is a subsystem of
$(A, b)$, the partial assignment of $x_{Y_0}$ to the variables in $N_{G(A)}(Y_i) \cap Y_0$ can be
extended to a feasible solution of $(A_i, b_i)$. By definition of equivalence, this shows that the
partial assignment can be extended to a feasible solution for
$(A'_i, b'_i, N_{G(A)}(Y_i) \cap Y_0)$, implying that the new variables that were added to
$(A', b')$ originating from $(A'_i, b'_i)$ can be assigned values from $\{0, \ldots, d - 1\}$ to
satisfy the constraints in $(A'_i, b'_i)$. Since these are the only constraints involving the
variables that were added from $(A'_i, b'_i)$, we can independently assign values to the new
variables introduced by the subsystems $(A'_i, b'_i)$ for $i \in [\ell]$ to obtain a feasible
assignment for $(A', b')$.

($\Leftarrow$) For the reverse direction, assume that there is an integer vector $x'$ such that
$A'x' \le b'$ and consider the corresponding partial assignment $x'_{Y_0}$ to the variables $Y_0$.
We will prove that $x'_{Y_0}$ can be extended to a feasible assignment for $(A, b)$. As $(A', b')$
contains all constraints of $(A, b)$ whose variable sets are contained in $Y_0$, we only have to
show that $x'_{Y_0}$ can be extended to the variables $Y_1, \ldots, Y_\ell$ while satisfying all
constraints involving at least one variable not in $Y_0$. Observe that for each variable
$x_j \notin Y_0$, all variables that occur in a common constraint with $x_j$ are neighbors of
$x_j$ in $G(A)$. Consequently, if we choose $i \in [\ell]$ such that $x_j \in Y_i$, then all
variables that occur in a constraint with $x_j$ are included in the set
$Y_i \cup (N_{G(A)}(Y_i) \cap Y_0)$, which implies that all constraints involving $x_j$ are
included in the subsystem $(A_i, b_i)$. Note that $(A', b')$ contains the boundaried ILP
$(A'_i, b'_i, N_{G(A)}(Y_i) \cap Y_0)$ which is equivalent to the boundaried ILP
$(A_i, b_i, N_{G(A)}(Y_i) \cap Y_0)$ that is included in $(A, b)$. As $x'_{Y_0}$ can be extended to
a feasible assignment for $(A', b')$, this partial assignment can be extended to a feasible
assignment for $(A'_i, b'_i, N_{G(A)}(Y_i) \cap Y_0)$, which implies by the definition of
equivalence that it can also be extended to a feasible assignment for
$(A_i, b_i, N_{G(A)}(Y_i) \cap Y_0)$, which includes all variables in $Y_i$ and all constraints
involving at least one variable of $Y_i$. Hence for each $i \in [\ell]$ we can extend $x'_{Y_0}$
to the variables of $Y_i$ while satisfying the constraints in $(A_i, b_i)$. These extensions can be
done independently since each variable is present in only one set $Y_i$. As this shows that
$x'_{Y_0}$ can be extended to the remaining variables of $(A, b)$ to satisfy all constraints that
involve at least one variable not in $Y_0$, it follows that $(A, b)$ has a feasible integer
solution by the argument given above. This shows that $(A, b)$ and $(A', b')$ are equivalent. □

This concludes the proof of Theorem 4. □

3.4 Limitations for replacing protrusions 

In this section, we discuss limitations regarding the replacement of protrusions in an ILP.
First of all, there is an information-theoretic limitation for the worst-case size replacement of
any $r$-boundaried ILP with variables $x_{t_1}, \ldots, x_{t_r}$ each with domain size $d$. Clearly,
there are $d^r$ different assignments to the boundary. For any set $\mathcal{A}$ of assignments to
the boundary variables, using auxiliary variables and constraints one can construct an
$r$-boundaried ILP whose feasible boundary assignments are exactly $\mathcal{A}$. This gives a lower
bound of $d^r$ bits for the encoding size of a general $r$-boundaried ILP, since we have $2^{d^r}$
subsets. Our first result regarding limitations for replacing protrusions is that this lower bound
even holds for boundaried ILPs of bounded treewidth.
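
The counting step behind this bound can be written out as a short worked derivation. It restates the argument above and adds no new assumptions.

```latex
\[
  \bigl|\{0,\ldots,d-1\}^{r}\bigr| = d^{r}
  \quad\Longrightarrow\quad
  \bigl|\{\mathcal{A} : \mathcal{A} \subseteq \{0,\ldots,d-1\}^{r}\}\bigr| = 2^{d^{r}},
\]
\[
  \text{so an encoding that distinguishes all sets } \mathcal{A}
  \text{ needs at least } \log_2 2^{d^{r}} = d^{r} \text{ bits in the worst case.}
\]
```

For instance, with $d = 2$ and $r = 3$ there are 8 boundary assignments and 256 possible sets of feasible boundary assignments, so at least 8 bits are needed.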

Proposition 3. For any $d, r \in \mathbb{N}$ and $\mathcal{A} \subseteq \{0, \ldots, d - 1\}^r$ there
is an $r$-boundaried ILP of treewidth $3r$ with domain size $d$, whose feasible boundary assignments
are $\mathcal{A}$.

The proposition follows from the fact that the encoding in Lemma 1 produces a boundaried
ILP of treewidth at most $3r$. To find an $r$-boundaried ILP of small treewidth whose feasible
assignments are $\mathcal{A}$, we may therefore first construct an arbitrary $r$-boundaried ILP
whose feasible boundary assignments are $\mathcal{A}$, and then invoke Lemma 1. The encoding used
in Lemma 1 has size $O(d^{2r})$. Note that, when using an encoding for sparse matrices, our
replacement size comes fairly close to the information-theoretic lower bound.

Second, the lower bounds for 0/1-ilpf(n), which follow, e.g., from lower bounds for hitting set
parameterized by ground set size, imply that there is no hope for a kernelization just in terms
of deletion distance to a system of bounded treewidth. (This distance is upper bounded by $n$.)
Note that the bound relies on a fairly direct formulation of hitting set instances as ILPs,
which creates huge cliques in the Gaifman graph when expressing sets as large inequalities (over
indicator variables). The lower bound can be strengthened somewhat by instead representing
sets less directly: For each set, “compute” the sum of its indicator variables using auxiliary
variables for partial sums. Similarly to the example of subset sum, this creates a structure of
bounded treewidth. Note, however, that this structure is not a (useful) protrusion because its
boundary can be as large as $n$; this is indeed the crux of having only a modulator to bounded
treewidth but no guarantee for (or means of proving) the existence of protrusions with small
boundaries.

Finally, we prove in the following theorem that the mentioned information-theoretic limitation
also affects the possibility of strong preprocessing, rather than being an artifact of the
definition of equivalent boundaried ILPs. In other words, there is a family of ilpf instances
that already come with a protrusion decomposition, and with a single variable of large domain,
but that cannot be reduced to size polynomial in the parameters of this decomposition. Note
that this includes all other ways of handling these instances, which establishes that protrusions
with even a single large-domain boundary variable can be the crucial obstruction to achieving
a polynomial kernelization.

Theorem 5. Assuming NP $\not\subseteq$ coNP/poly, there is no polynomial-time algorithm that
compresses instances $(A \in \mathbb{Z}^{m \times n}, b \in \mathbb{Z}^m)$ of ilpf with entries in
$\{-n, \ldots, n\}$ that consist of $\{0, 1\}$-variables except for a single variable of domain
$d \le n$, which are given together with a $(k, 5)$-protrusion decomposition
$Y_0 \cup Y_1 \cup \ldots \cup Y_\ell$ of $V(G(A))$, to size polynomial in $k + \hat{m} + \log d$,
where $\hat{m}$ is the number of constraints that affect only variables of $Y_0$.

Proof. We prove the theorem by giving a cross-composition from independent set to an
appropriate instance of ilpf. We begin by picking a polynomial equivalence relation $\mathcal{R}$
on instances $(G, k)$ of independent set. (Recall that such an instance asks whether $G$ contains
an independent set of size at least $k$.) We let all malformed instances be equivalent, including
those where $k > |V(G)|$. We let any two well-formed instances $(G_1, k_1)$ and $(G_2, k_2)$ be
equivalent if $|V(G_1)| = |V(G_2)|$ and $k_1 = k_2$, i.e., they have the same number of vertices
and ask for the same size of independent set. Clearly, equivalence can be checked in polynomial
time and there are at most $1 + |V(G)|^2 \le 1 + N^2$ equivalence classes on any finite set of
instances of size at most $N$ each.

Now, given $t$ $\mathcal{R}$-equivalent instances of independent set, say,
$(G_1, k), \ldots, (G_t, k)$ where each $G_i$ has exactly $n$ vertices, we construct an ILP with a
core part consisting of $O(n^2)$ variables and constraints plus a set of roughly $n^2$
4-protrusions. (Should the instances be malformed then the correct answer is no and we may return
any constant-size infeasible ILP.) Our construction rules out any form of kernel or compression
for the target problem to size polynomial in $(n + \log t)$ (unless NP $\subseteq$ coNP/poly).

Construction. The core part of the ilpf instance is formed by a straightforward set of
constraints that checks whether a set of $n$ variables encodes an independent set of size at
least $k$, subject to a set of edges that itself is given by variables. For ease of presentation,
assume that all input graphs have the same vertex set $V(G_i) = \{1, \ldots, n\}$ and (different)
edge sets $E(G_i) \subseteq \binom{[n]}{2}$.

• We introduce $n$ variables $x_1, \ldots, x_n$ that will encode an independent set; we constrain
them by enforcing $x_i \in \{0, 1\}$. Accordingly, since all $t$ instances seek an independent set
of size at least $k$ we add a constraint

$$\sum_{i=1}^{n} x_i \ge k. \qquad (3)$$

• We add $\binom{n}{2}$ variables $y_{i,j}$ for all $1 \le i < j \le n$ that will contain the
adjacency information of a single input graph; we also enforce $y_{i,j} \in \{0, 1\}$. For now we
only add constraints that enforce that the independent set encoded in $x_1, \ldots, x_n$ is
consistent with the information in the $y$-variables. For all $1 \le i < j \le n$ we enforce

$$x_i + x_j + y_{i,j} \le 2. \qquad (4)$$

• We add a single variable $s$ that effectively chooses one input graph. We enforce
$s \in \{1, \ldots, t\}$. This variable is used in the boundary of each protrusion.
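
The core constraints (3) and (4) can be checked directly on small examples. The sketch below is an illustration with hypothetical function names; vertices are 0-indexed here, whereas the text uses $\{1, \ldots, n\}$, and the $y$-variables are stored in a dictionary keyed by pairs $(i, j)$ with $i < j$.

```python
from itertools import combinations

def adjacency_vector(n, edges):
    """y[(i, j)] = 1 exactly when {i, j} is an edge, for all i < j."""
    edge_set = {tuple(sorted(e)) for e in edges}
    return {(i, j): int((i, j) in edge_set) for i, j in combinations(range(n), 2)}

def core_constraints_hold(x, y, k):
    """Check constraint (3), sum_i x_i >= k, and constraint (4) for all pairs."""
    n = len(x)
    if sum(x) < k:
        return False
    return all(x[i] + x[j] + y[(i, j)] <= 2 for i, j in combinations(range(n), 2))

# Path on three vertices: 0 - 1 - 2.
y = adjacency_vector(3, [(0, 1), (1, 2)])
```

With this `y`, the vector `x = [1, 0, 1]` passes for `k = 2`, while `x = [1, 1, 0]` fails constraint (4) on the pair `(0, 1)`.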

Now we describe the protrusion part of the instance. For each $1 \le i < j \le n$ we add
additional variables and constraints whose corresponding graph has treewidth four and such that
we retrieve the edge information for graph $G_s$. In the following, we explain how to build the
constraints that encode the edge information for edge $\{i, j\}$ for all graphs
$G_1, \ldots, G_t$ for some fixed choice of $i$ and $j$.

• We begin with adding indicator variables $d^{i,j}_p$ for $p \in \{1, \ldots, t\}$ with the
intention of enforcing

$$d^{i,j}_p = \begin{cases} 0 & \text{if } s = p, \\ 1 & \text{if } s \ne p. \end{cases}$$

We restrict these variables to domain $\{0, 1\}$. Once we have these available it will be
straightforward to enforce that $y_{i,j}$ correctly represents whether $\{i, j\} \in E(G_s)$, i.e.,
whether the graph $G_s$ chosen by $s \in \{1, \ldots, t\}$ contains the edge $\{i, j\}$.

• As a first step, we add constraints enforcing that $d^{i,j}_p = 1$ for all $p$ with $s \ne p$.
To this end we add the following constraints for all $p \in \{1, \ldots, t\}$.

$$s \ge p - t \cdot d^{i,j}_p$$
$$s \le p + t \cdot d^{i,j}_p$$

Observe that $p$ and $t$ are constants in these constraints. Clearly, if $d^{i,j}_p = 0$ then we
must have $s = p$, so this can occur only for that particular variable $d^{i,j}_p = d^{i,j}_s$.
Unfortunately, this alone does not suffice, since it does not prevent setting all variables
$d^{i,j}_p$ to one; we fix this in the next step.

• As a second step, we add constraints that are equivalent to enforcing
$\sum_p d^{i,j}_p = t - 1$. (Note that we cannot outright add this constraint as its treewidth
would be huge.) To this end, we use additional variables $c^{i,j}_1, \ldots, c^{i,j}_t$ with domain
$\{0, 1\}$. The constraints are as follows.

$$c^{i,j}_1 = d^{i,j}_1$$
$$c^{i,j}_p = c^{i,j}_{p-1} + d^{i,j}_p - 1 \qquad \forall p \in \{2, \ldots, t\}$$
$$c^{i,j}_t = 0$$

Adding up all equations except for $c^{i,j}_t = 0$ and subtracting
$c^{i,j}_1 + \ldots + c^{i,j}_{t-1}$ from both sides yields
$c^{i,j}_t = \sum_p d^{i,j}_p - (t - 1)$. Thus, combined with $c^{i,j}_t = 0$, this enforces
$\sum_p d^{i,j}_p = t - 1$.

We observe that both sets of constraints together enforce the desired behavior for all
variables $d^{i,j}_p$. Since $s \in \{1, \ldots, t\}$, we have $t - 1$ choices of $p$ with
$s \ne p$. For the corresponding variables $d^{i,j}_p$ we enforced that $d^{i,j}_p = 1$. Due to the
constraint $\sum_p d^{i,j}_p = t - 1$ we must therefore have $d^{i,j}_s = 0$.

• Now we may use the $d^{i,j}_p$ variables to enforce that the presence of edge $\{i, j\}$ in graph
$G_s$ is correctly stored in $y_{i,j}$. It suffices to add the following constraints (i.e., one
constraint is added for each $p \in \{1, \ldots, t\}$ with the choice depending on whether
$\{i, j\} \in E(G_p)$).

$$y_{i,j} \ge 1 - d^{i,j}_p \qquad \text{if } \{i, j\} \in E(G_p),$$
$$y_{i,j} \le 0 + d^{i,j}_p \qquad \text{if } \{i, j\} \notin E(G_p).$$

Recall that $y_{i,j}$ has domain $\{0, 1\}$ and thus whenever $d^{i,j}_p = 1$ there is no additional
restriction for $y_{i,j}$. In the (unique) case that $d^{i,j}_p = 0$ (together with
$y_{i,j} \in \{0, 1\}$) this clearly enforces

$$y_{i,j} = \begin{cases} 1 & \text{if } \{i, j\} \in E(G_p), \\ 0 & \text{if } \{i, j\} \notin E(G_p). \end{cases}$$

Since we ensured that $d^{i,j}_p = 0$ if and only if $s = p$ this correctly enforces that $y_{i,j}$
carries the information about presence of $\{i, j\}$ in $G_s$, as desired.

• Finally, let us check that the added variables together with $y_{i,j}$ and $s$ indeed correspond
to a protrusion in the corresponding Gaifman graph. Clearly, only $s$ and $y_{i,j}$ occur in any
further constraints so the boundary size is two. It can be checked that the treewidth of the
subgraph induced by vertices of the present variables is at most four. (Key: Make a path
decomposition of bags $\{s, y_{i,j}, c^{i,j}_{p-1}, c^{i,j}_p, d^{i,j}_p\}$ for increasing $p$.)
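
The behavior of one such gadget can be confirmed by brute force for small $t$. The sketch below is an illustration with hypothetical names. It assumes the constraints $p - t \cdot d_p \le s \le p + t \cdot d_p$, the edge constraints on $y$, and the partial-sum chain $c_p = c_{p-1} + d_p - 1$ for $p \ge 2$ with $c_t = 0$; the starting equation $c_1 = d_1$ is an assumption made here so that the chain sums to $\sum_p d_p - (t - 1)$ as described. The superscript $i,j$ is dropped since a single pair is considered.

```python
from itertools import product

def gadget_ok(s, y, d, c, t, has_edge):
    """Check all gadget constraints; d[p-1] and c[p-1] hold d_p and c_p."""
    for p in range(1, t + 1):
        dp = d[p - 1]
        if not (p - t * dp <= s <= p + t * dp):   # first step: d_p = 0 forces s = p
            return False
        if has_edge[p - 1] and not y >= 1 - dp:   # edge present in G_p
            return False
        if not has_edge[p - 1] and not y <= dp:   # edge absent in G_p
            return False
    # second step: chain of partial sums (c_1 = d_1 is an assumed starting equation)
    if c[0] != d[0] or c[t - 1] != 0:
        return False
    return all(c[p - 1] == c[p - 2] + d[p - 1] - 1 for p in range(2, t + 1))

def gadget_solutions(t, has_edge):
    """All pairs (s, y) for which some 0/1 choice of d and c is feasible."""
    solutions = set()
    for s in range(1, t + 1):
        for y in (0, 1):
            for d in product((0, 1), repeat=t):
                for c in product((0, 1), repeat=t):
                    if gadget_ok(s, y, d, c, t, has_edge):
                        solutions.add((s, y))
    return solutions

# The pair {i, j} is an edge of G_1 and G_3 but not of G_2.
result = gadget_solutions(3, [True, False, True])
```

For this input the only feasible pairs are `(1, 1)`, `(2, 0)` and `(3, 1)`: the value of `y` is forced to the adjacency of the graph selected by `s`.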

This completes our construction. Since we claim that the ilpf problem does not even admit a
polynomial kernelization or compression when a protrusion decomposition is given along with
the input, we have to construct a protrusion decomposition as part of the cross-composition. It
is defined as follows. The set $Y_0$ has size $n + \binom{n}{2} + 1$ and consists of the variables
$x_i$ for $i \in [n]$, variables $y_{i,j}$ for $1 \le i < j \le n$, and the variable $s$. For each
choice of $1 \le i < j \le n$ we create a set $Y_{i,j}$ in the protrusion decomposition that
contains the variables for the protrusion involving $s$ and $y_{i,j}$ that was described above. The
neighborhood of each such set $Y_{i,j}$ in $Y_0$ consists of $s$ together with $y_{i,j}$ and
therefore has size at most two. As argued above, the treewidth of the graph induced by the
protrusion and its neighborhood is at most four. We need $\ell = \binom{n}{2}$ different sets in
the protrusion decomposition. It is therefore a $(k', 5)$-protrusion decomposition for
$k' = \max(|Y_0|, \ell) \in O(n^2)$. To bound the number $\hat{m}$ of constraints that affect only
the variables of $Y_0$, observe that the only constraints whose support is a subset of $Y_0$ are
the single constraint (3), the $\binom{n}{2}$ constraints (4), and the $O(|Y_0|)$ domain-enforcing
constraints on $Y_0$. We therefore have $\hat{m} \in O(n^2)$. Since the only variable that does not
have a binary domain is $s$ with domain $\{1, \ldots, t\}$, we have $d := t$. For the total
parameter value of the constructed instance we therefore have
$k' + \hat{m} + \log d \in O(n^2 + \log t)$, which is polynomial in the size of the largest input
instance plus $\log t$. It is therefore suitably bounded for a cross-composition. To complete
the lower bound it suffices to prove that the constructed instance acts as the logical OR of the
inputs.

Correctness. We will keep this brief since we already discussed the workings of the created
instance. If at least one graph $G_{s^*}$ has an independent set of size at least $k$, then set the
indicators $x_i$ accordingly and set $s = s^*$. Furthermore, set the edge indicators $y_{i,j}$
according to $E(G_{s^*})$. It remains to verify that there are feasible values for
$c^{i,j}_p, d^{i,j}_p$ for all $1 \le i < j \le n$ and $1 \le p \le t$. It can be verified that the
following values are feasible (taking into account $s = s^*$ and the value of the $y_{i,j}$
variables).

$$c^{i,j}_p = \begin{cases} 1 & \text{if } p < s, \\ 0 & \text{if } p \ge s, \end{cases} \qquad d^{i,j}_p = \begin{cases} 1 & \text{if } p \ne s, \\ 0 & \text{if } p = s. \end{cases}$$

Conversely, we already discussed that for any choice of $s$ the variables $y_{i,j}$ correspond to
the presence of edge $\{i, j\}$ in $G_s$. Thus, the remaining constraints correctly verify the
presence of an independent set of size at least $k$ in $G_s$. This implies that $(G_s, k)$ is yes.
This completes the proof of Theorem 5. □

Intuitively, the parameterization chosen in the theorem implies that everything can be
bounded to size polynomial in the parameters except for the variables in $Y_1, \ldots, Y_\ell$ and
the (encoding size of the) constraints that are fully contained in protrusions (recall that
constraints give cliques in $G(A)$). To put this lower bound into context, we prove that a more
general (and less technical) variant is fixed-parameter tractable.

Theorem 6. The following variant of ilpf is FPT: We allow a constant number $c$ of variables
with polynomially bounded domain; all other variables have domain size $d$. Furthermore, there is
a specified set of variables $S \subseteq \{x_1, \ldots, x_n\}$ such that the graph $G(A) - S$ has
bounded treewidth. The parameter is $d + |S|$.

Proof. This follows readily from the dynamic programming algorithm for ILPs of bounded
treewidth once we take care of the at most $c$ high-domain variables. To do so, we can simply
branch over all possible assignments to the variables since the total number of choices of $c$
values from a polynomially bounded domain is itself polynomially bounded. Thus, for each choice we
replace the high-domain variables by the chosen values and (after rearranging) obtain an ILP on
slightly fewer variables that all have bounded domain. Now, let us note that the treewidth of
$G(A)$ is initially bounded by $|S| + O(1)$ and that the Gaifman graph for the new ILP has at most
the same treewidth (effectively it is obtained by deleting the vertices corresponding to
high-domain variables). Thus, we may run that algorithm with $w \le |S| + O(1)$ to obtain the
claimed result. □
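
The branching step of this proof can be sketched as follows. The function name and the dense representation are hypothetical, and the bounded-treewidth solver that would be run on each branch is not shown. Substituting $x_j = a_j$ for a high-domain variable removes its column and moves its contribution to the right-hand side.

```python
from itertools import product

def branch_over_high_domain(A, b, high_vars, domains):
    """Yield (assignment, A_rest, b_rest) for every assignment to high_vars.

    domains[j] is an iterable of admissible values for column j. After
    substitution, b_rest[p] = b[p] - sum_j A[p][j] * a_j over the fixed columns.
    """
    keep = [j for j in range(len(A[0])) if j not in high_vars]
    for values in product(*(domains[j] for j in high_vars)):
        A_rest = [[row[j] for j in keep] for row in A]
        b_rest = [rhs - sum(row[j] * a for j, a in zip(high_vars, values))
                  for row, rhs in zip(A, b)]
        yield dict(zip(high_vars, values)), A_rest, b_rest

# x0 + 2*x1 <= 5 and -x0 <= 0, where x1 is the single high-domain variable.
A = [[1, 2], [-1, 0]]
b = [5, 0]
branches = list(branch_over_high_domain(A, b, high_vars=[1], domains={1: range(3)}))
```

Each of the three branches is an ILP on the remaining variable only; for `x1 = 2` the first constraint becomes `x0 <= 1`.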

4 Totally unimodular subproblems 

Recall that a matrix $A$ is totally unimodular (TU) if and only if each square submatrix of $A$ has 
determinant in $\{-1,0,1\}$; this requires that $A \in \{-1,0,1\}^{m \times n}$ since any single entry 
defines a one by one square submatrix. If an ILP is given by $Ax \le b$ where $A$ is totally unimodular 
and $b$ is integral, then all extremal points of the corresponding polyhedron are integral. Thus, 
solving the relaxed LP suffices for feasibility and even for optimizing any function $c^T x$ subject 
to $Ax \le b$. 
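
The definition can be checked directly on very small matrices. The following Python sketch is our own illustration (the helper names `det` and `is_totally_unimodular` are not from the paper); it enumerates all square submatrices, so it takes exponential time and is only meant to make the definition concrete, not to serve as a practical recognition algorithm.

```python
from itertools import combinations

def det(m):
    """Exact integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_totally_unimodular(a):
    """Check that every square submatrix has determinant in {-1, 0, 1}."""
    rows = len(a)
    cols = len(a[0]) if a else 0
    for k in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), k):
            for cs in combinations(range(cols), k):
                sub = [[a[i][j] for j in cs] for i in rs]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True

# Each row has consecutive ones (an interval matrix), which is TU.
interval_matrix = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
# Edge-vertex incidence matrix of a triangle; its determinant is 2.
odd_cycle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]

print(is_totally_unimodular(interval_matrix), is_totally_unimodular(odd_cycle))
```

The triangle example shows that entries in $\{-1,0,1\}$ are necessary but not sufficient for total unimodularity.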

We say that a matrix $A$ is totally unimodular plus $p$ entries if $A$ can be obtained from a 
totally unimodular matrix by replacing any $p$ entries by new values. (This is more restrictive 
than the equally natural definition of adding $p$ arbitrary rows or columns.) We note that ILPF 
is FPT with respect to parameter $p + d$, where $d$ bounds the domain: It suffices to try all 
(at most $d^p$) assignments for variables whose column in $A$ has at least one modified entry. 
After simplification the obtained system is TU and, thus, existence of a feasible assignment for 
the remaining variables can be tested in polynomial time. 

Our following result shows that, despite fixed-parameter tractability for parameter p + d, 
the existence of a polynomial kernelization is unlikely; this holds already when d = 2. 

Theorem 7. ILP Feasibility restricted to instances $(A, b, p)$ where $A$ is totally unimodular 
plus $p$ entries does not admit a kernel or compression to size polynomial in $p$ unless 
NP $\subseteq$ coNP/poly, even if all domains are $\{0,1\}$. 

Proof. We reduce from the Hitting Set(n) problem, in which we are given a set $U$, a set 
$\mathcal{F} \subseteq 2^U$ of subsets of $U$, and an integer $k \in \mathbb{N}$, and we have to decide 
whether there is a choice of at most $k$ elements of $U$ that intersects all sets in $\mathcal{F}$; the 
parameter is $n := |U|$. Dom et al. [9] proved that Hitting Set(n) admits no polynomial 
kernelization or compression (in terms of $n$) unless NP $\subseteq$ coNP/poly. We present a 
polynomial-parameter transformation from Hitting Set(n) to ILPF(p) with domain size 2. The ILP 
produced by the reduction will be of the form $Ax \le b$ where $A$ is totally unimodular plus 
$p = n$ entries. Thus, any polynomial kernelization or compression in terms of $p$ would give a 
polynomial compression for Hitting Set(n) and, thus, imply NP $\subseteq$ coNP/poly as claimed. 

Construction. Let an instance $(U, \mathcal{F}, k)$ be given. We construct an ILP with 0/1-variables 
that is feasible if and only if $(U, \mathcal{F}, k)$ is yes for Hitting Set(n). 

• Our ILP has two types of variables: $x_{u,F}$ for all $u \in U$, $F \in \mathcal{F}$ and $x_u$ for all 
$u \in U$. For all variables we enforce domain $\{0,1\}$ by $x_{u,F} \ge 0$, $x_{u,F} \le 1$, 
$x_u \ge 0$, and $x_u \le 1$ for $u \in U$ and $F \in \mathcal{F}$. 

• The variables $x_{u,F}$ are intended to encode which elements $u \in U$ “hit” which sets 
$F \in \mathcal{F}$. We enforce that each set $F \in \mathcal{F}$ is “hit” by adding the following constraint. 

$$ 1 \le \sum_{u \in F} x_{u,F} \qquad \forall F \in \mathcal{F} \tag{5} $$

• The variables $x_u$ are used to control which variables $x_{u,F}$ may be assigned values greater 
than zero; effectively, they correspond to the choice of a hitting set from $U$. Control of 
the $x_{u,F}$ variables comes from the following constraints. 

$$ \sum_{F \in \mathcal{F}} x_{u,F} \le |\mathcal{F}| \cdot x_u \qquad \forall u \in U \tag{6} $$

Additionally, we constrain the sum over all $x_u$ to be at most $k$, in line with the concept 
of having $x_u$ select a hitting set of size at most $k$. 

$$ \sum_{u \in U} x_u \le k \tag{7} $$



Clearly, the construction can be performed in polynomial time. Let us now prove correctness. 

Correctness. Assume that $(U, \mathcal{F}, k)$ is yes for Hitting Set(n) and let $S \subseteq U$ be a hitting 
set for $\mathcal{F}$ of size at most $k$. Set all $x_u$ to 1 if $u \in S$ and to 0 otherwise; this fulfills (7). 
For each $F \in \mathcal{F}$ there is at least one $u \in S \cap F$ as $S$ is a hitting set for $\mathcal{F}$ and we set the 
variable $x_{u,F}$ to 1; all other $x_{u,F}$ are set to 0. Thus, we satisfy all constraints (5). Clearly, 
only $x_{u,F}$ with $u \in S$ (and hence $x_u = 1$) receive value 1 and, thus, this assignment fulfills 
also (6). Finally, we note that all variables receive values from $\{0,1\}$, as required. Thus, our 
assignment is a feasible solution for the ILP, as claimed. 

Conversely, assume that the ILP is feasible and fix any feasible assignment to all variables. 
Define a set $S \subseteq U$ by picking those $u \in U$ with $x_u = 1$. Clearly, by (7) and domain $\{0,1\}$, 
there are at most $k$ such variables and, hence, $|S| \le k$. We will prove that $S$ is a hitting set 
for $\mathcal{F}$, so fix any set $F \in \mathcal{F}$. By constraint (5), at least one variable $x_{u,F}$ with $u \in F$ must take 
value 1. The corresponding element $u$ must be in $S$ since constraint (6) plus non-negativity 
can only be fulfilled when $x_u \ge 1$ (as the left-hand side sum takes value greater than 0). By 
choice of $S$ this implies $u \in S$. Overall we get that $S$ contains some element $u \in S \cap F$ for all 
sets $F \in \mathcal{F}$, implying that $S$ is indeed a hitting set for $\mathcal{F}$. 
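
As a sanity check of the construction, the following minimal Python sketch builds constraints (5), (6), and (7) for a tiny instance and compares ILP feasibility with the existence of a hitting set by exhaustive search. The function names (`build_ilp`, `ilp_feasible`, `has_hitting_set`) and the dictionary-based encoding of constraints are our own illustrative choices and not part of the reduction; domain $\{0,1\}$ is enforced implicitly by enumerating only 0/1 assignments.

```python
from itertools import product

def build_ilp(universe, family, k):
    """Return (variables, constraints); a constraint (coeffs, rhs) means sum(coeffs[v] * x[v]) <= rhs."""
    sets = range(len(family))
    variables = [("x", u, f) for u in universe for f in sets] + [("x", u) for u in universe]
    constraints = []
    # (5): every set F is hit, written as -sum_{u in F} x_{u,F} <= -1.
    for f in sets:
        constraints.append(({("x", u, f): -1 for u in family[f]}, -1))
    # (6): sum_F x_{u,F} - |F| * x_u <= 0 for every element u.
    for u in universe:
        coeffs = {("x", u, f): 1 for f in sets}
        coeffs[("x", u)] = -len(family)
        constraints.append((coeffs, 0))
    # (7): at most k elements are selected.
    constraints.append(({("x", u): 1 for u in universe}, k))
    return variables, constraints

def ilp_feasible(variables, constraints):
    """Brute force over all 0/1 assignments (toy instances only)."""
    for values in product((0, 1), repeat=len(variables)):
        x = dict(zip(variables, values))
        if all(sum(c * x[v] for v, c in coeffs.items()) <= rhs
               for coeffs, rhs in constraints):
            return True
    return False

def has_hitting_set(universe, family, k):
    """Brute force over all subsets of the universe."""
    for bits in product((0, 1), repeat=len(universe)):
        picked = {u for u, bit in zip(universe, bits) if bit}
        if len(picked) <= k and all(picked & set(members) for members in family):
            return True
    return False

universe = [1, 2, 3]
family = [{1, 2}, {2, 3}]
for k in range(3):
    print(k, ilp_feasible(*build_ilp(universe, family, k)), has_hitting_set(universe, family, k))
```

On each tested budget $k$ the two answers agree, as the correctness argument predicts.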

In the remainder of the proof, we show that the constraints can be written as $Ax \le b$ 
where $A$ is totally unimodular plus $n$ entries (here $x$ stands for the vector of all variables $x_{u,F}$ 
and $x_u$ over all $u \in U$ and $F \in \mathcal{F}$). First, we need to write our constraints in the form 

$$ A' \begin{pmatrix} x_{U,\mathcal{F}} \\ x_U \end{pmatrix} \le b', $$

where, e.g., $x_U$ stands for the column vector of all variables $x_u$ with $u \in U$. 

For now, we translate constraints (5), (6), and (7) into this form; domain-enforcing constraints 
will be discussed later. We obtain the following, where $(1/0)$ and $(-1/0)$ are shorthand for 
submatrices that are entirely 0 except for exactly one $1$ or one $-1$ per column, respectively. 
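$$
A' x =
\begin{pmatrix}
(-1/0) & 0 \\
(1/0) & \begin{matrix} -|\mathcal{F}| & & 0 \\ & \ddots & \\ 0 & & -|\mathcal{F}| \end{matrix} \\
0 \cdots 0 & 1 \cdots 1
\end{pmatrix}
\begin{pmatrix} x_{U,\mathcal{F}} \\ x_U \end{pmatrix}
\le
\begin{pmatrix} (-1, \ldots, -1)^T \\ (0, \ldots, 0)^T \\ k \end{pmatrix}
\qquad
\begin{matrix} (5) \\ (6) \\ (7) \end{matrix}
$$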


In this expression, the columns of the matrix $A'$ are split into two groups. The first group 
contains the $|\mathcal{F}| \cdot |U|$ columns for the variables $x_{u,F}$ for $u \in U$ and $F \in \mathcal{F}$, while the second 
group contains the $|U|$ columns for the variables $x_u$ with $u \in U$. Let us denote by $\hat{A}'$ the matrix 
obtained from $A'$ by replacing all $n$ entries $-|\mathcal{F}|$ by zero. 

$$
\hat{A}' =
\begin{pmatrix}
(-1/0) & 0 \\
(1/0) & 0 \\
0 \cdots 0 & 1 \cdots 1
\end{pmatrix}
$$


It is known that any matrix over $\{-1,0,1\}$ in which every column has at most one entry $1$ 
and at most one entry $-1$ is totally unimodular (cf. j2UL Theorem 13.9]). Since $\hat{A}'$ is of this form, 
it is clear that $\hat{A}'$ is totally unimodular. To obtain the whole constraint matrix $A$ we need to add 
rows corresponding to domain-enforcing constraints for all variables, and reset the $-|\mathcal{F}|$ values 
that we replaced by zero. Clearly, putting back the latter breaks total unimodularity (and this 
is why $A$ is only almost TU), but let us add everything else and see that the obtained matrix $\hat{A}$ 
is totally unimodular. The domain-enforcing constraints affect only one variable each and, thus, 
each of them corresponds to a row in $\hat{A}$ that contains only a single nonzero entry of value $1$ 
or $-1$. It is well known that adding such rows (or columns) preserves total unimodularity. (The 
determinant of any square submatrix containing such a row can be reduced to that of a smaller 
submatrix by expanding along a row that has only one nonzero of $1$ or $-1$.) 

Finally, $A$ and $\hat{A}$ are only distinguished by the $n$ entries of value $-|\mathcal{F}|$ that are present in $A$ 
but which are 0 in $\hat{A}$. Since $\hat{A}$ is totally unimodular, it follows that $A$ is totally unimodular 
plus $p = n$ entries, as claimed. □ 
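
The structural claim can be inspected on a small instance. The sketch below is our own illustration (the names `constraint_matrix`, `columns_in_network_form`, and `count_differences` are hypothetical helpers, and the column order is an arbitrary choice): it writes out the rows for constraints (5), (6), and (7), optionally with the $n$ entries $-|\mathcal{F}|$ replaced by zero, and tests the sufficient condition used in the proof, namely that every column has entries in $\{-1,0,1\}$ with at most one $1$ and at most one $-1$. Domain-enforcing rows are omitted.

```python
def constraint_matrix(universe, family, keep_large_entries=True):
    """Rows for (5), (6), (7); columns are all x_{u,F} (grouped by u), then all x_u."""
    n, m = len(universe), len(family)
    width = n * m + n
    rows = []
    for f, members in enumerate(family):            # (5): -sum_{u in F} x_{u,F} <= -1
        row = [0] * width
        for i, u in enumerate(universe):
            if u in members:
                row[i * m + f] = -1
        rows.append(row)
    for i in range(n):                               # (6): sum_F x_{u,F} - |F| x_u <= 0
        row = [0] * width
        for f in range(m):
            row[i * m + f] = 1
        row[n * m + i] = -m if keep_large_entries else 0
        rows.append(row)
    rows.append([0] * (n * m) + [1] * n)             # (7): sum_u x_u <= k
    return rows

def columns_in_network_form(matrix):
    """True if every column is over {-1,0,1} with at most one 1 and at most one -1."""
    for column in zip(*matrix):
        if any(entry not in (-1, 0, 1) for entry in column):
            return False
        if column.count(1) > 1 or column.count(-1) > 1:
            return False
    return True

def count_differences(a, b):
    """Number of positions in which two equally sized matrices differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

universe, family = [1, 2, 3], [{1, 2}, {2, 3}]
full = constraint_matrix(universe, family, keep_large_entries=True)
relaxed = constraint_matrix(universe, family, keep_large_entries=False)
print(columns_in_network_form(relaxed), count_differences(full, relaxed))
```

For this instance the relaxed matrix passes the column test and differs from the full matrix in exactly $n = 3$ entries.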

Complementing Theorem 7 we prove that TU subsystems of an ILP can be reduced to a 
size that is polynomial in the domain, with degree depending on the number of variables that 
occur also in the rest of the ILP. We again phrase this in terms of replacing boundaried ILPs 
and prove that any $r$-boundaried TU subsystem can be replaced by a small equivalent system 
of size polynomial in the domain $d$ with degree depending on $r$. 

Lemma 2. There is an algorithm with the following specifications: (1) It gets as input an 
$r$-boundaried ILP $(A, b, x_{t_1}, \ldots, x_{t_r})$ with domain size $d$, with $A \in \mathbb{Z}^{m \times n}$, 
$b \in \mathbb{Z}^m$, and such that the restriction of $A$ to columns $[n] \setminus \{t_1, \ldots, t_r\}$ is totally 
unimodular. (2) Given such an input it takes time $O(d^r \cdot g(n,m) + d^{2r})$ where $g(n,m)$ is the 
runtime for an LP solver for determining feasibility of a linear program with $n$ variables and $m$ 
constraints. (3) Its output is an equivalent $r$-boundaried ILP $(A', b', x'_{t'_1}, \ldots, x'_{t'_r})$ of 
domain size $d$ containing $O(r \cdot d^r)$ variables and constraints, and all entries of $(A', b')$ in 
$\{-d, \ldots, d\}$. 

Proof. Similar to the treewidth case (Lemma 1) this is a two-step process where we first determine 
a set $C$ of infeasible partial assignments $(a_1, \ldots, a_r)$, i.e., such that no feasible assignment 
for $Ax \le b$ has $x_{t_1} = a_1, \ldots, x_{t_r} = a_r$, and then encode those directly using new constraints 
and variables. The second part can be done just as for Lemma 1, but we use a different routine 
for finding infeasible assignments since we no longer have bounded treewidth of $G(A)$. 

Concretely, for every $(a_1, \ldots, a_r) \in \{0, \ldots, d-1\}^r$ we set $x_{t_1} = a_1, \ldots, x_{t_r} = a_r$ 
and simplify the system to $A'x' \le b'$ where $x'$ is the vector of all remaining variables. Here $A'$ 
is simply $A$ restricted to columns $[n] \setminus \{t_1, \ldots, t_r\}$ and 

$$ b'[i] = b[i] - \sum_{j=1}^{r} A[i,t_j] \, a_j, \qquad \text{for } i \in [m]. $$

We recall that $A'$ is totally unimodular and note that $b'$ is integer, implying that we can test 
feasibility of $A'x' \le b'$ in polynomial time using an LP solver (e.g., the ellipsoid method). 

Again, this gives us a list of all infeasible assignments to $x_{t_1}, \ldots, x_{t_r}$ and we proceed as in 
the proof of Lemma 1 to output an equivalent ILP with $r$ boundary variables. This completes 
the proof. □ 
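
The first step of the proof, substituting a boundary assignment and testing the remaining system, can be sketched as follows. This is a minimal illustration under our own assumptions: the function names are hypothetical, columns are indexed from 0, and, to keep the example self-contained, exhaustive enumeration over $\{0, \ldots, d-1\}$ stands in for the LP solver that the proof relies on (for a TU system an LP feasibility test would give the same answer in polynomial time).

```python
from itertools import product

def substitute_boundary(a, b, boundary, values):
    """Fix x_{t_j} = values[j]; return (A', b') on the remaining columns."""
    keep = [j for j in range(len(a[0])) if j not in boundary]
    a_red = [[row[j] for j in keep] for row in a]
    b_red = [b[i] - sum(a[i][t] * v for t, v in zip(boundary, values))
             for i in range(len(a))]
    return a_red, b_red

def feasible_by_enumeration(a, b, d):
    """Stand-in for the LP solver: try every assignment from {0,...,d-1}^n."""
    n = len(a[0]) if a else 0
    for x in product(range(d), repeat=n):
        if all(sum(c * xi for c, xi in zip(row, x)) <= rhs
               for row, rhs in zip(a, b)):
            return True
    return False

def infeasible_boundary_assignments(a, b, boundary, d):
    """The set C of boundary assignments that admit no feasible completion."""
    bad = []
    for values in product(range(d), repeat=len(boundary)):
        a_red, b_red = substitute_boundary(a, b, boundary, values)
        if not feasible_by_enumeration(a_red, b_red, d):
            bad.append(values)
    return bad

# Toy system on (x0, x1, x2) with boundary variable x0 and d = 3:
#   x0 + x1 <= 2   and   -x1 <= -1.
A = [[1, 1, 0], [0, -1, 0]]
b = [2, -1]
print(infeasible_boundary_assignments(A, b, [0], 3))
```

Here only $x_0 = 2$ is infeasible, since it forces $x_1 \le 0$ while the second row demands $x_1 \ge 1$.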

Lemma 2 implies that if $T_0$ is a set of (at most) $p$ variables whose removal makes the 
remaining system TU, then the number of variables in the system can efficiently be reduced to 
a polynomial in $d + p$ with degree depending on $r$, if each TU subsystem depends on at most 
$r$ variables in $T_0$. To get this, it suffices to apply Lemma 2 once for each choice of at most $r$ 
boundary variables in $T_0$. (Note that without assuming a bounded value of $r$ we only know 
$r \le p$, so the worst-case bound obtained is not polynomial, but exponential, in $p + d$.) 

5 Observations on TU subsystems with small boundary 

In this section, we discuss some of the differences between TU subsystems and those of bounded 
treewidth (both with small boundary); concretely, we argue that the former are somewhat easier 
to handle than the latter. To this end, we discuss how one can improve Lemma 2 in comparison 
to its bounded-treewidth counterpart (Lemma 1). For intuition, consider the case of having a 
single boundary variable only, say $x_1$. Let $0 \le i < j < d$ such that there are integer feasible 
assignments for the subsystem with $x_1 = i$ and $x_1 = j$, respectively. By convexity, this gives 
fractionally feasible assignments for all $x_1 = i + \delta(j - i)$, for $0 \le \delta \le 1$. For every choice of 
$\delta$ such that $x_1$ is integer, we get a fractionally feasible assignment for the subsystem. Since the 
system with the value of $x_1$ plugged in simplifies to a TU system on the remaining variables, it 
follows that there are also integer feasible assignments that are consistent with this value of $x_1$. 
Thus, the feasible values for $x_1$ are equal to $\{i, \ldots, j\}$ for some $0 \le i \le j < d$, which takes only 
$O(\log d)$ bits to encode (by two simple constraints). In the same way, we can rule out a lower 
bound like the one in the corresponding theorem for the treewidth case. There we used two 
boundary variables with domains $\{0,1\}$ and $\{1, \ldots, t\}$, and relied heavily on the fact that the 
feasible boundary assignments were not simply the integer points in a convex polytope; thus, we 
effectively encoded $t$ bits of information. For the TU case, fixing the first variable yields a 
consecutive feasible interval for the second variable; the two intervals can be encoded in 
$O(\log t)$ bits. 
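
The interval property for a single boundary variable, and its failure without total unimodularity, can be observed on two toy systems. The sketch below is our own illustration: both constraint systems and the helper names are hypothetical, the boundary variable is placed in column 0, and feasibility is decided by exhaustive search over the domain $\{0, \ldots, d-1\}$.

```python
from itertools import product

def feasible_boundary_values(a, b, d):
    """Values v of the boundary variable (column 0) such that A x <= b has an
    integer solution with x_1 = v and all variables in {0,...,d-1}."""
    inner = len(a[0]) - 1
    feasible = []
    for v in range(d):
        for y in product(range(d), repeat=inner):
            x = (v,) + y
            if all(sum(c * xi for c, xi in zip(row, x)) <= rhs
                   for row, rhs in zip(a, b)):
                feasible.append(v)
                break
    return feasible

def is_interval(values):
    """True if the sorted list of values has no gaps."""
    if not values:
        return True
    return values == list(range(values[0], values[-1] + 1))

# TU on the non-boundary column (-1, 1, 0):  x1 <= y,  y <= 3,  x1 >= 1.
tu_rows, tu_rhs = [[1, -1], [0, 1], [-1, 0]], [0, 3, -1]
# Not TU on the non-boundary column (entries -2 and 2):  x1 = 2y.
parity_rows, parity_rhs = [[1, -2], [-1, 2]], [0, 0]

tu_values = feasible_boundary_values(tu_rows, tu_rhs, 6)
parity_values = feasible_boundary_values(parity_rows, parity_rhs, 6)
print(tu_values, parity_values)
```

The first system yields the consecutive range $\{1, 2, 3\}$, while the parity constraint yields $\{0, 2, 4\}$, which cannot be described by two bounds on $x_1$.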

In general, total unimodularity of the subsystem (minus the boundary) implies that we are 
only interested in the boundary assignments that have fractionally feasible completions in the 
subsystem. The argument of the previous paragraph generalizes to larger boundary sizes: if $x$ 
and $x'$ are feasible vectors of boundary assignments, then any convex combination $x + \delta(x' - x)$ 
for $0 \le \delta \le 1$ that is an integer vector can be extended to a feasible assignment. Let $B$ denote 
the set that contains, for every feasible integer assignment, its projection to the boundary 
variables. Then $B$ is closed under taking convex combinations that result in integer vectors. 
It follows that an integer boundary assignment is feasible if and only if it lies in the convex 
hull of $B$. Thus, the (encoding) complexity of feasible assignments to the boundary is governed 
by known upper and lower bounds for the integer hull of polytopes. According to a survey 
of Bárány [SJ Theorem 7.1] the maximum number of vertices for an integer polyhedron in $r$ 
dimensions whose points have coordinates from $\{0, \ldots, d-1\}$ is $\Theta(d^{r\frac{r-1}{r+1}})$, which 
converges to $O(d^{r-2})$ (from above) for growing $r$. This implies that the feasible boundary 
assignments to an $r$-boundaried ILP of domain $d$, whose non-boundary variables induce a TU 
system, can be described in $O(d^{r\frac{r-1}{r+1}} \, r \log d)$ bits by writing down the coordinates of all 
the vertices. The same polyhedral bound can be used to show that $\Omega(d^{r\frac{r-1}{r+1}})$ bits are 
necessary: An integral polyhedron in $\{0, \ldots, d-1\}^r$ with $N$ vertices forms the solution space 
of an ILP with $r$ variables. Any single vertex of the polyhedron can be cut off from the feasible 
region by an additional constraint: Take a halfspace that intersects the polyhedron only at that 
vertex and move it slightly toward the interior. The resulting integer polyhedron is again 
described by an ILP. As there is a different ILP for each of the $2^N$ subsets of vertices that can 
be cut off, it follows that at least $N$ bits are needed to describe such an ILP. Since there are 
integral polyhedra with $r$ variables, domain $\{0, \ldots, d-1\}$ and $N \in \Omega(d^{r\frac{r-1}{r+1}})$ vertices, it 
follows that this number lower bounds the number of bits needed to encode an $r$-boundaried ILP 
whose remaining variables form a TU system. (Note that this lower bound applies even if the TU 
part is empty.) It is interesting to note the contrast with $r$-boundaried ILPs of treewidth $O(r)$, 
for which $\Omega(d^r)$ bits are necessary by Proposition 1 and trivially sufficient. 
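
Reading the exponent of the vertex bound as $r\frac{r-1}{r+1}$, the following elementary rewriting (our own calculation, not taken from the survey) makes the convergence from above explicit and quantifies the saving relative to the $d^r$ bits needed in the treewidth case:

```latex
\[
  r \cdot \frac{r-1}{r+1}
  \;=\; \frac{(r-2)(r+1) + 2}{r+1}
  \;=\; r - 2 + \frac{2}{r+1},
  \qquad\text{so}\qquad
  \frac{d^{\,r}}{d^{\,r\frac{r-1}{r+1}}} \;=\; d^{\,2 - \frac{2}{r+1}} .
\]
```

For $r = 1$ the exponent is $0$, matching the single-interval description of one boundary variable, and for $r = 2$ it is $2/3$; as $r$ grows the saved factor $d^{2 - 2/(r+1)}$ approaches $d^2$ from below.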


6 Discussion and future work 

We have studied the effect that subsystems with bounded treewidth or total unimodularity have 
regarding kernelization of the ILP Feasibility problem. We show that if such subsystems have 
a constant-size boundary to the rest of the system, then they can be replaced by an equivalent 
subsystem of size polynomial in the domain size (with degree depending on the boundary size). 
Thus, if an ILPF instance can be decomposed by specifying a set of $p$ shared variables whose 
deletion (or replacing with concrete values) creates subsystems that are all TU or bounded 
treewidth and have bounded dependence on the p variables, then this can be replaced by an 
equivalent system whose number of variables is polynomial in p and the domain size d. We point 
out that for the case of binary variables (at least in the boundary) the replacement structures 
get much simpler, using no additional variables and with only a single constraint per forbidden 
assignment. Using a similar approach and binary encoding for boundary variables should reduce 
the number of additional variables to $O(\log d)$ per boundary variable. 
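
For binary boundary variables, one way to realize a single constraint per forbidden assignment is the standard exclusion cut sketched below. This particular inequality is our assumption about what such a constraint could look like, not a construction taken from the paper, and the function names are hypothetical. For a forbidden assignment $a \in \{0,1\}^r$ it uses $\sum_{i : a_i = 1} x_i - \sum_{i : a_i = 0} x_i \le |\{i : a_i = 1\}| - 1$, whose left-hand side equals the number of ones in $a$ minus the Hamming distance between $x$ and $a$.

```python
from itertools import product

def forbid_assignment(a):
    """Return (coeffs, rhs) such that sum(coeffs[i] * x[i]) <= rhs holds for
    every 0/1 vector x except x == a."""
    coeffs = [1 if bit == 1 else -1 for bit in a]
    rhs = sum(a) - 1
    return coeffs, rhs

def violating_points(coeffs, rhs):
    """All 0/1 vectors that violate the constraint (exhaustive check)."""
    return [x for x in product((0, 1), repeat=len(coeffs))
            if sum(c * xi for c, xi in zip(coeffs, x)) > rhs]

coeffs, rhs = forbid_assignment((1, 0, 1))
print(coeffs, rhs, violating_points(coeffs, rhs))
```

The constraint uses only the boundary variables themselves and coefficients in $\{-1, 1\}$, so no auxiliary variables are introduced.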

Complementing this, we established several lower bounds regarding limitations of replacing 
such subsystems. Inherently, the replacement rules rely on having subsystems with small boundary 
size for giving polynomial bounds. We showed that this is indeed necessary by giving lower 
bounds for fairly restricted settings where we do not have the guarantee of constant boundary 
size, independent of the means of data reduction. In the case of treewidth we could also show 
that boundaries with only one large-domain variable can be a provable obstacle. For the case of 
totally unimodular subsystems the discussion in the previous section shows that these behave in 
a slightly simpler way than bounded-treewidth subsystems: By ad hoc arguments we can save 
a factor of d in the encoding size by essentially dropping the contribution of any one boundary 
variable; thus we rule out a lower bound proof for the case of one boundary variable having 
large domain. Asymptotically, we can save a factor of almost $d^2$, which is tight. It would be 
interesting whether having two or more large domain variables (in a boundary of constant size) 
would again allow a lower bound against kernelization. 

A natural extension of our work is to consider the optimization setting where we have 
to minimize or maximize a linear function over the variables, and may or may not already 
know that the system is feasible. In part, our techniques are already consistent with this since 
the reduction routine based on treewidth dynamic programming or optimization over a TU 
subsystem can be easily augmented to also optimize a target function over the variables. A 
technical caveat, however, is the following: If we simplify a protrusion, then along with each 
feasible assignment to the boundary, we have to store the best target function contribution 
that could be obtained with the variables that are removed; this value can, theoretically, be 
unbounded in all other parameters. If a binary encoding of such values is sufficiently small (or 
if the needed space is allowed through an additional parameter), then our results also carry over 
to optimization. Apart from that, a rigorous analysis of both weight reduction techniques and 
possible lower bounds is left as future work. 